Abstract 

Motivated by questions of manifold learning, we study a sequence of random manifolds, 
generated by embedding a fixed, compact manifold M into Euclidean spheres of 
increasing dimension via a sequence of Gaussian mappings. One of the fundamental 
smoothness parameters of manifold learning theorems is the reach, or critical radius, of 
M. Roughly speaking, the reach is a measure of a manifold’s departure from convexity, 
which incorporates both local curvature and global topology. 

This paper develops limit theory for the reach of a family of random, Gaussian-embedded 
manifolds, establishing both almost sure convergence for the global reach, 
and a fluctuation theory for both it and its local version. The global reach converges to 
a constant well known both in the reproducing kernel Hilbert space theory of Gaussian 
processes, as well as in their extremal theory. 
1 Introduction 

This paper has two themes. One lies in the general area of the geometry of Gaussian 
processes, or random fields, over general spaces, and is about random embeddings. 
The second is more topological, and can be seen as putting probability measures on 
spaces of manifolds, and then studying the behavior of their reach. Both are motivated 
by recent results in manifold learning. 

1.1 Gaussian embeddings 

We start with parameter spaces which will always be $m$-dimensional, compact, smooth 
manifolds, without boundary, and which will be denoted by $M$. On $M$, we define a 
centered, unit variance, smooth, Gaussian process $f : M \to \mathbb{R}$, the distribution of 
which is characterized by its covariance function $C : M \times M \to \mathbb{R}$. Taking $k \ge 1$, we 
also define an $\mathbb{R}^k$-valued process 

$$ f^k(x) = (f_1(x), f_2(x), \dots, f_k(x)), \tag{1.1} $$

made up of the first $k$ processes in an infinite sequence of i.i.d. copies of $f$. It is not 
hard to check (under the mild side requirements that will be made formal later) that 
(1.1) defines, with probability one, an embedding (i.e. an injective homeomorphism) 
$f^k(M)$ of $M$ into $\mathbb{R}^k$ for all $k \ge 2m + 1$, akin to what one would expect from the 
Whitney embedding theorem. We call this a Gaussian embedding of $M$. 

It is easy to check that the diameter of $f^k(M)$ is $O(\sqrt{k})$. Thus, to keep the 
embedding under control, we need to normalise it either by $\sqrt{k}$, or self-normalise by 
defining 

$$ h^k(x) \triangleq \frac{f^k(x)}{\|f^k(x)\|}, \qquad x \in M, \tag{1.2} $$

where $\|\cdot\|$ is the standard Euclidean norm, and consider the embedding $h^k(M)$, which 
now lies in the unit sphere $S^{k-1}$ in $\mathbb{R}^k$. For reasons of notational convenience, this is 
the embedding that we shall consider in the current paper. We could just as 
well have adopted a $\sqrt{k}$ normalisation without any qualitative changes in our results, 
although some of the details would be different. We call $h^k(M)$ a self-normalised 
Gaussian embedding of $M$. 
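
As a purely illustrative sketch of the construction in (1.1) and (1.2), the following samples a self-normalised Gaussian embedding of the unit circle (a stand-in for $M$) on a finite grid. The random trigonometric series used for $f$, the weights `a`, and the function name `sample_self_normalised_embedding` are assumptions made for this example only; they are not taken from the paper.

```python
import numpy as np


def sample_self_normalised_embedding(k, n_points=200, n_freq=5, seed=0):
    """Sample f^k and h^k = f^k / ||f^k|| on a grid of the unit circle.

    Each coordinate process is a hypothetical random trigonometric series
        f_i(t) = sum_j a_j * (xi_ij * cos(j t) + eta_ij * sin(j t)),
    with sum_j a_j**2 = 1, so that f_i is centered with unit variance.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    j = np.arange(1, n_freq + 1)
    a = np.full(n_freq, 1.0 / np.sqrt(n_freq))   # assumed spectral weights
    cos_part = np.cos(np.outer(t, j))            # shape (n_points, n_freq)
    sin_part = np.sin(np.outer(t, j))
    xi = rng.standard_normal((k, n_freq))        # i.i.d. N(0, 1) coefficients
    eta = rng.standard_normal((k, n_freq))
    # Row r of fk is the vector f^k(t_r) in R^k, as in (1.1).
    fk = cos_part @ (a * xi).T + sin_part @ (a * eta).T
    # Self-normalisation as in (1.2): every row of hk lies on the unit sphere.
    hk = fk / np.linalg.norm(fk, axis=1, keepdims=True)
    return t, fk, hk


t, fk, hk = sample_self_normalised_embedding(k=50)
```

Comparing `np.linalg.norm(fk, axis=1)` for several values of `k` gives a quick numerical feel for the $O(\sqrt{k})$ growth that motivates the normalisation.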

However, although all of $M$, $f$ and the ambient spheres are smooth, it is not so 
clear how smooth these embeddings are going to be as $k \to \infty$. On the one hand, the 
self-normalisation in (1.2) ensures that $h^k(M)$ lies in a fixed radius sphere. On the 
other hand, high-dimensional spheres are strange objects, with surface areas tending 
to zero as the dimension grows. Thus, given the increasing independence added into 
the mapping with each new $f$ component, it is not at all a priori clear whether the 
embeddings eventually become rough, and perhaps fractal, or whether there is some 
sort of strong law behavior that leads to deterministic behavior in the limit. If the 
latter case is correct (which it is) then an associated fluctuation theory is called for. 

The main results of this paper resolve these issues, at least in the framework of the 
reach of the self-normalised Gaussian embeddings $h^k(M)$, as $k \to \infty$. 

1.2 Reach 

The modern notion of reach seems to have appeared first in the classic paper [7] of 
Federer, in which he introduced the notion of sets with positive reach and their associated 
curvatures and curvature measures. In doing so, Federer was able to include, in 
a single framework, Steiner’s tube formula for convex sets and Weyl’s tube formula for 
smooth submanifolds of $\mathbb{R}^n$. The importance of this framework extended, however, 
far beyond tube formulae, as it became clear that much of the theory surrounding 
convex sets could be extended to sets that were, in some sense, locally convex, and 
that the reach of a set was precisely the way to quantify this property. 

To be just a little more precise (a formal definition will be given below in Section 
2.1), we start with a smooth manifold $N$ embedded in an ambient manifold $\widehat{N}$. Then 
the local reach at a point $x \in N$ is the furthest distance one can travel, along any 
vector based at $x$ but normal to $N$ in $\widehat{N}$, without meeting a similar vector originating 
at another point in $N$. The (global) reach of $N$ is then the infimum of all local reaches. 
As such it is related to local properties of $N$ through its second fundamental form, but 
also to global structure, since points on $N$ that are far apart in a geodesic sense might 
be quite close in the metric of the ambient space $\widehat{N}$. The reach of a manifold is also 
known as its ‘critical radius’ for a good geometrical reason described below, and we 
shall use both terms interchangeably. (See the paragraph following (2.8).) 

We shall give precise definitions in the following section, noting for now that beyond 
its importance in tube formulae and other classical areas of Differential Geometry and 
Topology, the notion of positive reach has recently begun to play an important role in 
the literature of Topological Data Analysis (TDA) in general, and manifold learning 
via algebraic techniques in particular. We shall discuss this briefly at the end of Section 2. 

1.3 Main results and structure of the paper 

With the terminology we have so far alluded to (but in most cases have yet to define 
rigorously) let $\theta(N, x)$ denote the local reach of a manifold $N$ at the point $x \in N$, 
while 

$$ \tau = \tau(N) = \inf_{x \in N} \theta(N, x) $$

denotes the global reach of $N$. We, however, are interested in the reach of $h^k(M)$, and 
the main result of this paper is Theorem 3.3, which states that there is a deterministic 
function $\sigma_c^2(f, x)$, $x \in M$, such that, with probability one, and uniformly in $x \in M$, 

$$ \cot^2\left(\theta\left(h^k(M), h^k(x)\right)\right) \to \sigma_c^2(f, x), \tag{1.3} $$

as $k \to \infty$. An immediate consequence of this is the existence of a constant, denoted 
by $\sigma_c^2(f)$, such that the sequence of global reaches satisfies 

$$ \cot^2\left(\tau\left(h^k(M)\right)\right) \to \sigma_c^2(f) = \sup_{x \in M} \sigma_c^2(f, x). \tag{1.4} $$

While the notation regarding the various versions of $\sigma_c^2$ is a little heavy, it is 
time-honored, since the constant $\sigma_c^2(f)$ has appeared previously in the extremal theory of 
Gaussian processes. In fact, one of the most interesting aspects of the convergence 
in (1.4) is the a priori surprising fact that $\sigma_c^2(f)$ is the limit. This constant had 
arisen earlier in a completely different context in [1, 21]. That context, described 
briefly in Section 3.2, related to rigorously proving the so called ‘Euler characteristic 
heuristic’, which approximates a wide class of Gaussian extremal probabilities via the 
expected Euler characteristic of their excursion sets. The role of the constant there is 
in quantifying the super-exponentially small error rate involved in the approximation. 
We shall discuss the importance of this constant in more detail in Section 3. 

Given the convergence in (1.3), it is natural to ask if an associated fluctuation 
result also holds. Indeed, this is the case, and Theorem 3.3 also gives us that 

$$ \sqrt{k}\left(\cot^2\left(\theta\left(h^k(M), h^k(\cdot)\right)\right) - \sigma_c^2(f, \cdot)\right) \tag{1.5} $$

converges, in distribution, as $k \to \infty$, to a limit which can be bounded by the supremum 
of a certain Gaussian process, the precise distribution of which is given much later in 
Theorem 11.1. 

The remainder of the paper is organised as follows: In the following section we 
have collected some general results about positive reach that were a large part of the 
motivation for our study. The reader uninterested in motivation can skip all but the 
definition of reach in Section 2.1. The reader interested in knowing more about the 
history and applications of positive reach is referred to the excellent survey by Thäle 
[22], or Chapter 7 of [4], which discusses reach in the context of TDA. 

Section 3 defines Gaussian processes on manifolds and associated notions such as 
the induced metric. It also introduces the constant $\sigma_c^2(f)$. Much of this section is a 
quick summary of the material in [1] needed for this paper, and once this is done we 
have everything defined well enough to state the main result of the paper. 

The real work starts in Section 4, in which we develop specific representations 
for the critical radius of an embedded manifold, which form the basis of all that 
follows. Some of the results here already exist in the literature, and the proofs of these 
are relegated to an appendix. Some are new and full proofs are given in situ. Section 5 
lists four lemmas, from which, together with the representation of Section 4, the proof 
of the a.s. convergence in the main theorem follows easily. Following a brief section 
devoted to notation, Sections 7-10, which are where the hardest work lies, then prove 
these lemmas, one at a time. In Section 11 we turn to the fluctuation result of (1.5), 
both proving it and describing the limit process. Two technical appendices complete 
the paper. 

2 Critical radius and positive reach 

2.1 The definition 

Throughout the paper our underlying manifold $M$ will satisfy the following assumptions: 

Assumption 2.1. $M$ is an $m$-dimensional manifold, compact, boundaryless, oriented, 
$C^3$, and connected. 

Sometimes we shall assume that $M$ is associated with a Riemannian metric $g$, and 
sometimes that it is embedded in a smooth Riemannian manifold $(\widehat{M}, \widehat{g})$. The main 
example that we shall need for this paper for an embedding space is the unit sphere 
$\widehat{M} = S^{n-1}$, but we shall also meet the simple Euclidean case $\widehat{M} = \mathbb{R}^n$ when discussing 
tube formulae below. In the first example, geodesics are along great circles, and the 
associated Riemannian distance is measured via angular distance. In the second, the 
geometry is the standard Euclidean one. 

As an aside, we note that all our results could be extended to the case of manifolds 
with boundary, and even stratified manifolds satisfying the kind of side conditions 
endemic to [1]. However, then we would also have to suffer through all the heavy 
notation of [1], which seemed unnecessary, given that our primary motivation 
was to establish a general principle rather than the most general result possible. 

For the main result of the paper, all of the conditions in Assumption 2.1 are required. 
This is not true for some of the lemmas along the way, but for ease of exposition 
we shall generally adopt all the conditions throughout the paper. For the fluctuation 
result, we shall even need that $M$ is $C^6$, and we will add that assumption when needed. 
Of course, if the majority of the authors were topologists rather than probabilists, we 
would probably just have assumed that $M$ is ‘smooth’ (i.e. $C^\infty$) and then not have 
been concerned with optimal levels of differentiability. 

We need the standard exponential map (cf. [15]) that maps tangent vectors to 
points on the manifold. This, for $x \in \widehat{M}$ and $X \in T_x\widehat{M}$, the tangent space at $x$ in $\widehat{M}$, 
is given by the local diffeomorphism 

$$ \exp_x^{\widehat{M}}(X) = \gamma_{x, \eta_X}(\|X\|), $$

where $\gamma_{x, \eta_X}$ is the unit speed geodesic in $\widehat{M}$ starting at $x$ in the direction 
$\eta_X = X/\|X\| \in S(T_x\widehat{M})$, the (sphere of) unit tangent vectors at $x$. The notion of reach is 
closely related to the radius of the largest ball around the origins in $T_x\widehat{M}$, $x \in M$, for 
which all the exponential maps are, in fact, diffeomorphisms. 

To give a more formal definition, let $d_M(x, y)$ ($d_{\widehat{M}}(x, y)$) denote geodesic distance 
between points $x, y \in M$ ($\in \widehat{M}$), and for $x \in M$ and $A \subset M$ set 

$$ d_M(x, A) = \inf_{y \in A} d_M(x, y), $$

with a similar definition for $x \in \widehat{M}$ and $A \subset \widehat{M}$. 

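Then the local reach, or local critical radius, of $M$ in $\widehat{M}$ at $x$, in a direction 
$\eta \in S(T_x\widehat{M})$, is defined by 

$$ \theta_\ell(x, \eta) = \sup\left\{\rho : d_{\widehat{M}}\left(\exp_x^{\widehat{M}}(\rho\eta), M\right) = \rho\right\}. \tag{2.6} $$

Thus, if $\rho > \theta_\ell(x, \eta)$, there is a point $y \ne x$ in $M$ which is closer to $\exp_x^{\widehat{M}}(\rho\eta)$ than $x$ 
is. The local critical radius of $M$ in $\widehat{M}$ at the point $x$ is defined as 

$$ \theta(M, x) = \theta(x) = \inf_{\eta \in T_x^{\perp}M \cap S(T_x\widehat{M})} \theta_\ell(x, \eta), \tag{2.7} $$

where $T_x^{\perp}M$ is the normal space at $x$, of $M$ in $\widehat{M}$. Taking an infimum over the entire 
manifold finally gives the global reach, or critical radius, of $M$ in $\widehat{M}$: 

$$ \tau(M) = \tau = \theta(M, \widehat{M}) = \inf_{x \in M} \theta(x). \tag{2.8} $$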

A more picturesque definition of reach, in the Euclidean setting for which $\widehat{M} = \mathbb{R}^n$, 
which also explains the terminology ‘critical radius’, is as follows: Imagine rolling a 
ball of radius $r$ and dimension $n$ over the manifold $M$, but in such a way that the ball 
only touches $M$ at a single point. The largest choice of radius that allows this is the 
critical radius. 

For some examples in which $\widehat{M}$ is a Euclidean space of codimension at least one 
with respect to $M$, note that if $M$ is a convex set, then its reach will be infinite. In 
fact, infinite reach characterizes convex sets in this case. If $M$ is a sphere, then its 
reach is equal to its radius. If $M$ is the disjoint union of two spheres, the reach is the 
minimum of the two radii and half of the closest distance between the spheres. 

If $\widehat{M}$ is itself a sphere, and $M$ a great circle, then the reach of $M$ (in angular 
coordinates) will be $\pi/2$. In general, the reach of a closed subset of a sphere will be 
no more than $\pi/2$. 

This is all you need to know about reach to skip to Section 3 and read the rest of 
the paper. The rest of this section is motivational. 

2.2 Medial axis 

An alternative way to think of reach is via the notions of the medial axis of $M$ and its 
local feature size, notions which have been developed in the Computational Geometry 
community. Given $M$ embedded in $\widehat{M}$, define the set 

$$ G = \left\{y \in \widehat{M} : \exists\, x_1 \ne x_2 \in M \text{ such that } d_{\widehat{M}}(y, M) = d_{\widehat{M}}(y, x_1) = d_{\widehat{M}}(y, x_2)\right\}. $$

The closure of $G$ is called the medial axis, and for any $x \in M$ the local feature size 
$s(x)$ is the distance of $x$ to the medial axis. It is easy to check that 

$$ \theta(M, \widehat{M}) = \inf_{x \in M} s(x). $$
2.3 On tube formulae 

As mentioned earlier, the birthplace of the notion of reach is Weyl’s volume of tubes 
formula, a classical result in Differential Geometry, and an extension of the much 
earlier Steiner’s tube formula for convex sets in $\mathbb{R}^n$. Interestingly, Weyl’s original 
paper [24] was motivated by a question raised by Hotelling [9] related to the derivation 
of confidence regions around regression curves. Both of these papers still make for 
fascinating (but not easy) reading today, and both generated enormous literatures, 
one mathematical (e.g. [8]) and one statistical (e.g. [10] and the literature referenced 
there). For its importance to Probability see, for example, [1] and the references 
therein. 

Restricting ourselves to the Euclidean setting for the moment, define the tube of 
radius $\rho > 0$ around $M$ in $\widehat{M} = \mathbb{R}^n$ to be 

$$ \mathrm{Tube}(M, \rho) = \left\{x \in \mathbb{R}^n : \inf_{y \in M} \|y - x\| \le \rho\right\}. $$

Then Weyl’s tube formula states that, for $\rho < \theta(M, \mathbb{R}^n)$, 

$$ \mathrm{Vol}\left(\mathrm{Tube}(M, \rho)\right) = \sum_{j=0}^{m} \omega_{n-j}\, \rho^{n-j}\, \mathcal{L}_j(M), \tag{2.9} $$

where Vol is $n$-dimensional Lebesgue volume, $\omega_n$ denotes the volume of a unit 
$n$-dimensional ball, and the $\mathcal{L}_j(M)$ are the Lipschitz-Killing curvatures of $M$. These 
are also known as quermassintegrales, Minkowski, Dehn and Steiner functionals, and 
intrinsic volumes, although in many of these cases the indexing and normalisations 
differ. It is worth noting, as Weyl established in what he considered the part of [24] 
that required more than “what could have been accomplished by any student in a 
course of calculus”, that these functionals are intrinsic. That is, they are independent 
of the embedding of $M$ into $\mathbb{R}^n$. (See, for example, Lemma 10.5.1 in [1], where this fact 
is given a probabilistic proof in the notation we use here.) 
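
To make the role of the reach in the tube formula concrete, here is a small sketch checking Weyl's formula, in the standard form $\sum_{j=0}^{m} \omega_{n-j}\rho^{n-j}\mathcal{L}_j(M)$, for a circle of radius $R$ in the plane ($m = 1$, $n = 2$). For the circle, $\mathcal{L}_0$ is the Euler characteristic (zero) and $\mathcal{L}_1$ is the length $2\pi R$. The function names are hypothetical and chosen only for this illustration.

```python
import math


def weyl_rhs_circle(R, rho):
    """Weyl's tube formula for a circle of radius R in the plane (m = 1, n = 2)."""
    omega_1, omega_2 = 2.0, math.pi   # volumes of the unit 1- and 2-dimensional balls
    L0 = 0.0                          # Euler characteristic of the circle
    L1 = 2.0 * math.pi * R            # length of the circle
    return omega_2 * rho ** 2 * L0 + omega_1 * rho * L1


def tube_area_circle(R, rho):
    """Exact area of {x : dist(x, circle) <= rho}.

    For rho <= R the tube is an annulus; for rho > R the hole closes up
    and the tube is a full disc of radius R + rho.
    """
    inner = max(R - rho, 0.0)
    return math.pi * ((R + rho) ** 2 - inner ** 2)
```

Since the reach of the circle equals its radius, the two functions agree for `rho <= R` and disagree once `rho` exceeds `R`, which is exactly the breakdown of the formula beyond the reach.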

It is hard to overstate the importance of (2.9), along with its variants for more 
general ambient spaces. The fact that the formula ceases to hold for p larger than the 
reach means that all the applications of tube formulae also fail at some point, and it 
is knowing where this point is that makes the reach such an important parameter of a 
manifold. 

2.4 Condition numbers, manifold learning and learning homology 

Standard manifold learning scenarios usually start with a ‘cloud’ of points 
$\mathcal{X} = \{x_1, \dots, x_n\}$ in some high dimensional space, which are believed to be sampled from an 
underlying manifold $M$ of much lower dimension $m$, with or without additional noise. 
(Additional noise will mean that the points need not lie on $M$ itself, but rather are 
sampled from some region near $M$.) A classical problem is to construct a set which 
approximates $M$ in a useful fashion. This is a well known problem with a vast literature, 
and ‘useful’ here is usually taken to mean physical closeness in some norm. 

More recently, a new literature has appeared, motivated by ideas from Algebraic 
Topology, in which the aim of physical closeness is replaced with the aim of correctly 
recovering the topology of M. Two of the earliest papers in this area are [17, 16] (but 
see also [6]) and it is these papers that were in fact the original motivators of the 
current one. 

In [17] the setup is that of a random sample from $M$, and the recovery method, 
or at least the theorems describing its properties, relies on knowing the reach $\tau$ of 
$M$. In this case, choosing an $\epsilon \in (0, \tau/2)$, the simple union of $\epsilon$-balls centered at the 
points of $\mathcal{X}$ is chosen as the estimator of $M$. That is, 

$$ M_{\mathrm{estimate}} = \bigcup_{x \in \mathcal{X}} B_\epsilon(x). \tag{2.10} $$

The arguments in [17] follow the two stage structure of Smale described above. Firstly, 
it shows that if one has a dense enough subset of points in $M$ then $M$ is a deformation 
retract of $M_{\mathrm{estimate}}$, and so both sets have the same homology. For the second stage, 
it shows that if a large enough sample is taken then one can bound, from below, the 
probability of the sample being dense enough. The final result is that, for all small $\delta$, 
if 

$$ n > \beta_1\left(\log(\beta_2) + \log\left(\frac{1}{\delta}\right)\right), \tag{2.11} $$

where 

$$ \beta_1 = \frac{4^m\, \mathrm{vol}(M)}{\omega_m (\epsilon \cos\gamma_1)^m}, \qquad \beta_2 = \frac{8^m\, \mathrm{vol}(M)}{\omega_m (\epsilon \cos\gamma_2)^m}, $$

and $\gamma_1 = \arcsin(\epsilon/(8\tau))$ and $\gamma_2 = \arcsin(\epsilon/(16\tau))$, then the homology of $M_{\mathrm{estimate}}$ equals 
the homology of $M$ with probability at least $1 - \delta$. A corresponding result for the case 
of sampling with noise is given in [16]. 

We have included the above equations to show, explicitly, how the reach appears in 
the complexity of this estimation problem. The smaller the reach is, the smaller one 
is forced to take $\epsilon$, and the larger the sample size $n$ needs to be for a given estimation 
accuracy. 
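
The dependence of the sample size bound on the reach can be explored numerically. The sketch below evaluates the right-hand side of the bound $n > \beta_1(\log\beta_2 + \log(1/\delta))$ with $\beta_1, \beta_2, \gamma_1, \gamma_2$ as in (2.11). The function names and the numerical inputs in the usage lines are hypothetical, chosen only to illustrate the trend.

```python
import math


def unit_ball_volume(m):
    """Volume omega_m of the unit ball in R^m."""
    return math.pi ** (m / 2.0) / math.gamma(m / 2.0 + 1.0)


def sample_size_bound(m, vol_M, eps, tau, delta):
    """Right-hand side of the bound n > beta_1 (log beta_2 + log(1/delta)).

    m: dimension of M; vol_M: volume of M; eps: ball radius in (0, tau/2);
    tau: reach of M; delta: allowed failure probability.
    """
    if not 0.0 < eps < tau / 2.0:
        raise ValueError("eps must lie in (0, tau/2)")
    gamma_1 = math.asin(eps / (8.0 * tau))
    gamma_2 = math.asin(eps / (16.0 * tau))
    omega_m = unit_ball_volume(m)
    beta_1 = 4.0 ** m * vol_M / (omega_m * (eps * math.cos(gamma_1)) ** m)
    beta_2 = 8.0 ** m * vol_M / (omega_m * (eps * math.cos(gamma_2)) ** m)
    return beta_1 * (math.log(beta_2) + math.log(1.0 / delta))


# Hypothetical inputs: same volume and eps = tau / 4, but two different reaches.
n_large_reach = sample_size_bound(m=1, vol_M=2.0 * math.pi, eps=0.25, tau=1.0, delta=0.05)
n_small_reach = sample_size_bound(m=1, vol_M=2.0 * math.pi, eps=0.125, tau=0.5, delta=0.05)
```

With $\epsilon$ tied to the reach in this way, halving the reach increases the bound, in line with the qualitative statement above.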

Of course, for a given problem, one does not know what M is, and so, a fortiori, little 
is known about its reach. Consequently, in the spirit of Smale’s two step procedure, 
we need to enrich the second stage by also averaging over possible M. The current 
paper is a step in this direction, by formulating a class of random Gaussian manifolds 
and beginning a study of their reach. 

Moreover, the main result of the paper has an immediate application in the manifold 
learning situation. Although Theorem 3.3 relates only to a very specific random 
embedding of M into a high dimensional sphere, a liberal interpretation of the theorem 
implies that the part of the complexity of the estimation problem depending on reach 
is more or less independent of any embedding of M into a higher dimensional space. 
The import of this is that there is no ‘curse of dimensionality’, related to reach, that 
involves the dimension of the ambient space. 

Of course, we can only make these claims for the Gaussian-embedding that we 
study, but the fact that they are proven in the Gaussian case will alleviate concerns 
among practitioners, in general, that ambient dimension has an effect on reach. This 
was not known until now, even for a special case. 

A second practical implication of this paper is the introduction, albeit implicit, of 
a new class of smooth random manifolds that are both reasonable and mathematically 
tractable. Recalling the two stage paradigm of Smale above, it would be interesting, 
and probably useful, to introduce into the TDA setting the notion of Bayesian 
optimization. In terms of the above homology learning example, by this we mean 
maximizing not the probability of correctly identifying the homology for a fixed (but 
unknown) M, but rather minimizing the expectation of some cost function of this 
probability, averaged over a (random) family of possible M. The calculations of the 
current paper, along with those of [13] which address issues of the asymptotic isometry 
of Gaussian-embedded Riemannian manifolds, show that the model introduced here 
allows for tractable mathematical manipulation. 
3 Gaussian processes on manifolds, and the main theorem 

As mentioned earlier, our basic reference for Gaussian processes is [1]. Here we shall 
only give the very minimum in definitions and notation needed for this paper. 

3.1 Gaussian processes on Riemannian manifolds 

We start, as usual, with a compact manifold $M$, with or without an associated 
Riemannian metric $g$. (For the novice, Section 6 explains these terms and some of the 
following notation.) 

A real valued Gaussian process, or random field, $f : M \to \mathbb{R}$, with zero mean 
(assumed henceforth) is then determined by its covariance function $C : M \times M \to \mathbb{R}$ 
given by 

$$ C(x, y) = \mathbb{E}\{f(x) f(y)\}. $$

If $C$ is smooth enough, the process also induces a Riemannian metric on the tangent 
bundle $T(M)$ of $M$ defined by 

$$ g_x(X_x, Y_x) \triangleq \mathbb{E}\{(Xf)(x) \cdot (Yf)(x)\} = Y_y X_x C(x, y)\big|_{y=x}, \tag{3.12} $$

where $X, Y$ are vector fields with values $X_x, Y_x$ in the tangent space $T_xM$. We shall 
assume throughout that $C$ is positive definite on $M \times M$, from which it follows that $g$ 
is a well defined Riemannian metric, which we call the metric induced by $f$. 

From now on, we shall make one of two - essentially complementary - assumptions: 

Assumption 3.1. If, in the above setting, we are given a manifold $M$ as in Assumption 
2.1 and a Gaussian process $f : M \to \mathbb{R}$, but no metric on $M$, we shall assume 
that $M$ is endowed with the metric induced by $f$. 

If, on the other hand, we start with a Riemannian manifold $(M, g)$, then we shall 
choose a Gaussian process in such a way that the metric induced by (3.12) is precisely 
$g$. 


The fact that given a metric g there exists a Gaussian process inducing this metric 
is a consequence of the Nash embedding theorem (cf. proof of Theorem 12.6.1 in [1]). 

The only additional assumptions that we require relate to smoothness and non-degeneracy 
for $f$, but for this we need some notation. Thus we write, from now on, 
$\nabla$ for the Levi-Civita connection of $(M, g)$, and $\nabla^2$ for the corresponding covariant 
Hessian. Fix an orthonormal (with respect to $g$) frame field $E = (E_1, \dots, E_m)$ in 
$T(M)$. The specific choice of $E$ is not important. 

Assumption 3.2. We assume that the zero mean Gaussian process $f : M \to \mathbb{R}$ has, 
with probability one, continuous first, second, and third order derivatives over $M$, and, 
for each $x \in M$, the joint distributions of the $(1 + m + m(m+1)/2)$-dimensional random 
vector 

$$ \left(f(x),\ ((\nabla f)(E_i))(x),\ ((\nabla^2 f)(E_i, E_j))(x),\ 1 \le i \le j \le m\right) $$

are nondegenerate. 

We shall also assume that $\mathbb{E}\{f^2(x)\}$, the variance of $f$, is constant, and for convenience 
we take the constant to be one. No other homogeneity assumptions are required. 
Regarding Assumption 3.2, we note that the requirement that $f \in C^3(M)$ is probably 
not necessary. It arises as a side issue in a tightness argument in Section 9.3, 
which requires a uniform bound on increments of fourth order derivatives of $C$. A 
(much) more complicated argument would probably require only that $f \in C^{2+\epsilon}(M)$ for 
some $\epsilon > 0$, but rather than lose sight of the forest for the trees we are happy to live 
with the extra smoothness. In fact, in order to prove the fluctuation result (1.5), we 
shall even have to assume that $f \in C^6(M)$. We shall explain how the need for these 
high levels of smoothness arises in a moment, when we have the requisite notation. 


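3.2 The parameter $\sigma_c^2(f)$ 

Given the above setting, we now define a new Gaussian process on 

$$ \widetilde{M} = (M \times M) \setminus \mathrm{diag}(M \times M) \tag{3.13} $$

by setting 

$$ f^x(y) = \frac{f(y) - \mathbb{E}\{f(y) \mid f(x), \nabla f(x)\}}{1 - C(x, y)}. \tag{3.14} $$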


The fact that this process is well defined is not obvious, since as $y \to x$ in (3.14) both 
numerator and denominator approach zero. Nevertheless, as we shall show in Section 
8.1, if we have enough smoothness for $f$, then the limit behaves well. For example, 
just to be certain that $\lim_{y \to x} f^x(y)$ is well defined requires that $f \in C^2(M)$. 

(In fact, ratios of the 0/0 nature appear throughout the proofs, with denominators 
such as $1 - C(x, y)$ (as above), $1 - C^2(x, y)$, and even $(1 - C^2(x, y))^2$, all of which 
are problematic as $y \to x$. For the a.s. convergence of (1.4), this leads to the 
requirement that $f \in C^3(M)$. For the fluctuation result (1.5) we will even need to 
assume that $f \in C^6(M)$. While these conditions seem, at first, rather severe, they 
seem to be necessary and not just a consequence of our method of proof.) 

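In any case, $f \in C^3(M)$ is more than enough to ensure that it makes sense to define 
the function $\sigma_c^2(f, \cdot)$, and the constant $\sigma_c^2(f)$, as follows: 

$$ \sigma_c^2(f, x) = \sup_{y \in M \setminus x} \mathrm{Var}(f^x(y)), \tag{3.15} $$

$$ \sigma_c^2(f) = \sup_{x \in M} \sigma_c^2(f, x). \tag{3.16} $$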

We now have everything we need to state the main result of the paper, but, first, 
we explain why the above two definitions are already ‘well known’. 

Associated with the Gaussian process $f$ are a reproducing kernel Hilbert space, $H$, 
and an $L_2$ space, $\mathcal{H}$, which is the completion of the span of $f$ over $M$. Writing $S(H)$ 
and $S(\mathcal{H})$ for the unit spheres of these spaces, there is an isometry, $\Psi$, between $M$, 
when given the metric $g$ induced by $f$, and $S(\mathcal{H})$, determined by $\Psi(x) = f(x)$, for all 
$x \in M$. There are also isometries between $S(\mathcal{H})$ and $S(H)$, and so between $M$ and 
$S(H)$, the details of which can be found, for example, in Chapter 3 of [1], but which 
date back to the earliest history of Gaussian processes. 

It turns out that $\sigma_c^2(f, x)$ is precisely $\cot^2$ of the local reach of $\Psi(M)$ at the point $f(x)$, 
when $S(\mathcal{H})$ is considered as a submanifold of $\mathcal{H}$. It follows immediately that $\sigma_c^2(f)$ 
corresponds in the same way to the global reach. Similar statements can be made about the isometric 
embedding of $M$ into $S(H)$, but would take longer to explain. The bottom line, 
however, is that both $\sigma_c^2(f, \cdot)$ and $\sigma_c^2(f)$ are basic quantities inherently connected to 
$(M, g)$ when it is viewed via isometric embeddings into larger spaces, and that there 
is a lot of Hilbert sphere geometry lying behind the asymptotics of this paper. 
These observations are relatively recent. In our current notation, they can be found 
in Section 14.4.3 of [1], but see also [19] and the references therein. 

The reason that $\sigma_c^2(f, \cdot)$ and $\sigma_c^2(f)$ have been of more recent interest is that they 
arise in the rigorous justification of the so-called ‘Euler characteristic heuristic’ for 
approximating the distribution of the supremum of smooth Gaussian processes. In 
this setting, let $\chi(A_u(f, M))$ denote the Euler characteristic of the excursion set $A_u$ 
of the Gaussian field $f$, defined by 

$$ A_u = A_u(f, M) = \{x \in M : f(x) \ge u\}. $$

It has been ‘well known’ for some decades that, at least for high levels $u \in \mathbb{R}$, the 
mean Euler characteristic provides a good estimate of the exceedance probability, 
$\mathbb{P}\{\sup_{x \in M} f(x) \ge u\}$. That is, for large $u$, the difference 

$$ \mathrm{Diff}_{f,M}(u) = \mathbb{E}\{\chi(A_u(f, M))\} - \mathbb{P}\left\{\sup_{x \in M} f(x) \ge u\right\} $$

is small. 

Relatively recently (cf. [1, 19, 21]) this statement has been made precise. (These 
sources actually treat the more general setting of stratified manifolds, which requires 
an additional condition of local convexity for $M$, as well as some minor side conditions 
on both $M$ and $f$. The definition of $\sigma_c^2(f)$ is also correspondingly changed. See, for 
example, [20] for a discussion of why local convexity is required. In fact, what is 
required is close to positive reach, and the reason that (3.17) fails for zero reach is 
much the same reason that tube formulae fail. But that is another story.) In our 
setting, it is now known that 

$$ \liminf_{u \to \infty} -u^{-2} \log\left|\mathrm{Diff}_{f,M}(u)\right| \ge \frac{1}{2}\left(1 + \frac{1}{\sigma_c^2(f)}\right). \tag{3.17} $$


3.3 Main result 

With the introduction, motivation and almost all of the notation behind us, we are 
almost ready to state the main result of the paper. However, two more items of 
notation are required. The first gives the local critical radius of $h^k(M)$, as a submanifold of 
$S^{k-1}$, at the image, under $h^k$, of the point $x \in M$. This is given, for $x \in M$, by 

$$ \theta_k(x) = \inf_{\eta} \theta_\ell(h^k(x), \eta), \tag{3.18} $$

where $\eta$ is a unit vector in the tangent space $T_{h^k(x)}S^{k-1}$ pointing in a normal direction 
to $h^k(M)$ at $h^k(x) \in h^k(M)$. The second gives the global reach, as 

$$ \theta_k = \inf_{x \in M} \theta_k(x). \tag{3.19} $$

Theorem 3.3. Let $M$ be a manifold satisfying Assumption 2.1, and let $f : M \to \mathbb{R}$ be 
a Gaussian process satisfying Assumptions 3.1 and 3.2. Assume that $\sigma_c^2(f)$, as defined 
by (3.16), is finite. Consider the embedding (1.2) of $M$ into the unit sphere in $\mathbb{R}^k$, 
and let $\theta_k$ be the global critical radius of the random manifold $h^k(M)$. Then, with 
probability one, 

$$ \cot^2 \theta_k \to \sigma_c^2(f), \quad \text{as } k \to \infty. \tag{3.20} $$

If, in addition, $M$ is $C^6$, and the sample paths of $f$ are a.s. $C^6$ on $M$, then there exists 
a sequence $\gamma_k$ of random processes from $M \to \mathbb{R}$, such that, for all $x \in M$, 

$$ \sqrt{k}\left|\cot^2 \theta_k(x) - \sigma_c^2(f, x)\right| \le |\gamma_k(x)|, \tag{3.21} $$

and a limit process $\gamma : M \to \mathbb{R}$, such that 

$$ \gamma_k(\cdot) \Rightarrow \gamma(\cdot), \tag{3.22} $$

where the convergence here is weak convergence, in the Banach space of continuous 
functions on $M$ with supremum norm, and 

$$ \gamma(x) = \sup_{y \in M \setminus x} \widetilde{\gamma}(x, y), $$

where $\widetilde{\gamma}$ is the Gaussian process over $\widetilde{M}$ defined by (11.76). 

We defer all further discussion of the fluctuation result of (3.21) and (3.22) until 
Section 11, where it will be restated as Theorem 11.1, and the (rather involved) 
definition of the process $\gamma$ will appear. Until then we shall concentrate on the a.s. 
convergence of (3.20). 

As an aside, note that a variation of some of the easier arguments in the following 
sections shows that the sequence of mappings $h^k$ tends, with probability one, to an 
isometry, in the sense that the associated pullbacks to $M$ of the usual metric on $S^{k-1}$ 
tend to the induced metric (3.12) on $M$. We provide a rigorous treatment of this result 
in [13], albeit with the self-normalisation of (1.2) replaced by a $\sqrt{k}$ normalisation. We 
also prove there that this gives rise to the a.s. convergence of a class of intrinsic 
functionals of $h^k(M)$ to the corresponding functionals on $(M, g)$. We refer you to [13] 
for details. 


4 Computation of the Critical Radius 


This section contains two purely geometric lemmas from which follow the probabilistic 
computations that make up most of the paper. The first gives a characterisation of the 
reach of general submanifolds of spheres, and the second does the same for the specific 
submanifolds $h^k(M)$ in terms of the functions $f^k$. To start, recall that geodesic distance 
on the sphere is measured in terms of angles, $r \in [0, \pi)$. Let $M$ be a submanifold of 
$S^{n-1}$, and $\eta_x$ a unit normal vector at $x \in M$. 

We can now state the following characterisation, which implicitly assumes, as we 
shall from now on, that $M$ has dimension at least one. As stated it is identical to 
Lemma 2.1 of [19], restricted to our setting. ([19] treats the more general setting of 
stratified manifolds.) Furthermore, as pointed out there, the proof is essentially the 
same as the proof given in [10] for the one-dimensional case. Nevertheless, because of 
its importance to this paper, and (only) for the sake of completeness, we give the proof 
in Appendix 1. 

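Lemma 4.1. Let $M$ be a submanifold of $S^{k-1}$, satisfying the conditions of Assumption 2.1. Let $T_x^\perp M \subset T_x\mathbb{R}^k$ be the normal space of $M$ at $x$ as it sits in $S^{k-1}$, viz. the orthogonal complement of $\mathrm{span}(T_xM \oplus \{x\}) \subset T_x\mathbb{R}^k$ in $T_x\mathbb{R}^k$. Then the critical radius, $\theta(x)$, at $x$ is given by

$$\cot^2(\theta(x)) = \sup_{y \in M \setminus \{x\}} \frac{\left\|P_{T_x^\perp M}\, y\right\|^2}{(1 - \langle x, y\rangle)^2},$$

where $P_{T_x^\perp M}$ is orthogonal projection onto $T_x^\perp M$.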
We are now in a position to derive the global reach of our random manifold $h^k(M)$. The result is given in the next lemma. However, before stating and proving the lemma, we need some preparatory definitions.

Recalling the embedding maps $h^k$ and the components $f_j$ of (1.2), let $(X_1, \dots, X_m)$ be a frame bundle of full rank over $M$, and define the $k \times (m+1)$ matrix

$$L_x = \begin{pmatrix} f_1(x) & X_1 f_1(x) & \cdots & X_m f_1(x) \\ \vdots & \vdots & & \vdots \\ f_k(x) & X_1 f_k(x) & \cdots & X_m f_k(x) \end{pmatrix}$$

and the projection matrix

$$P_x = L_x (L_x^T L_x)^{-1} L_x^T.$$

By the independence of the $f_j$ and the non-degeneracy of Assumption 3.2, the columns of $L_x$ are a.s. linearly independent, and so $L_x^T L_x$ is a.s. invertible. The matrix $P_x$ orthogonally projects vectors in $\mathbb{R}^k$ onto

$$\mathrm{span}\left(h^k(x),\ h^k_*(X_i),\ 1 \le i \le m\right),$$

considered as a subset of $T_{h^k(x)}\mathbb{R}^k$, where $h^k_* : T_xM \to T_{h^k(x)}S^{k-1}$ is the usual push forward operator.
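
As a numerical sanity check on a projector of this form, the following minimal Python sketch builds $L(L^TL)^{-1}L^T$ from a generic matrix with full column rank and verifies the properties that are used in the sequel. The matrix `L` is a random stand-in for $L_x$ (an assumption made purely for illustration), and the name `projection_matrix` is hypothetical.

```python
import numpy as np

def projection_matrix(L):
    """Orthogonal projector onto the column span of L (shape k x (m+1), full column rank)."""
    # Solve (L^T L) Z = L^T rather than forming the inverse explicitly.
    return L @ np.linalg.solve(L.T @ L, L.T)

rng = np.random.default_rng(1)
k, m = 12, 2
L = rng.standard_normal((k, m + 1))   # random stand-in for L_x; a.s. full column rank
P = projection_matrix(L)

v = rng.standard_normal(k)
residual = v - P @ v                  # (I - P) v, the component normal to the span

# Pythagoras for orthogonal projections: |(I - P) v|^2 = |v|^2 - |P v|^2
pythagoras_gap = residual @ residual - (v @ v - (P @ v) @ (P @ v))
```

The quantity `pythagoras_gap` should vanish up to rounding; this is the identity behind splitting a squared norm into the part inside the span and the part orthogonal to it.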

Consider now the following expression, well known from the Statistics literature as the maximum likelihood estimate, based on $k$ samples, of the correlation coefficient between $f(x)$ and $f(y)$; viz.

$$C_k(x,y) = \frac{\sum_{j=1}^k f_j(x) f_j(y)}{\|f^k(x)\|\,\|f^k(y)\|}. \tag{4.23}$$
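
To make the zero-centered correlation estimate concrete, here is a small Python sketch that computes it on a grid and compares it with the true covariance. The random trigonometric field on the circle, the weights, and all function names are assumptions chosen only for illustration; they are not the processes of the paper.

```python
import numpy as np

def sample_fields(k, grid, weights, rng):
    """Draw k i.i.d. copies of an illustrative unit-variance Gaussian field on the circle.

    Each copy is sum_n sqrt(w_n) (a_n cos(n t) + b_n sin(n t)) with a_n, b_n ~ N(0, 1),
    so its covariance is C(s, t) = sum_n w_n cos(n (s - t)). Returns shape (k, len(grid)).
    """
    freqs = np.arange(1, len(weights) + 1)
    amp = np.sqrt(weights)
    a = rng.standard_normal((k, len(weights))) * amp
    b = rng.standard_normal((k, len(weights))) * amp
    phase = np.outer(freqs, grid)
    return a @ np.cos(phase) + b @ np.sin(phase)

def true_covariance(grid, weights):
    """Covariance matrix C(s, t) of the illustrative field on the grid."""
    freqs = np.arange(1, len(weights) + 1)
    lags = grid[:, None] - grid[None, :]
    return np.cos(lags[..., None] * freqs) @ weights

def sample_correlation(f):
    """Zero-centered correlation estimate: inner products divided by norms."""
    norms = np.linalg.norm(f, axis=0)
    return (f.T @ f) / np.outer(norms, norms)

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
weights = np.array([0.5, 0.3, 0.2])            # must sum to 1 for unit variance

C = true_covariance(grid, weights)
C_k = sample_correlation(sample_fields(50_000, grid, weights, rng))

max_error = np.max(np.abs(C_k - C))
off_diag = ~np.eye(len(grid), dtype=bool)      # the ratio is 0/0 on the diagonal
max_ratio_error = np.max(np.abs((1 - C[off_diag]) / (1 - C_k[off_diag]) - 1))
```

Both `max_error` and `max_ratio_error` should be small for large `k`; the second quantity is the one that becomes delicate near the diagonal, where numerator and denominator both tend to zero.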


Consider the conditional process $f^x(y)$ defined on $M$ by (3.14) and denote $k$ i.i.d. realizations of it at $y$ by

$$f^{x,k}(y) = \left(f^x_1(y), \dots, f^x_k(y)\right).$$

Define an ‘error process’

$$E^{x,k}(y) = \frac{k}{\|f^k(y)\|^2}\,\frac{(1 - C(x,y))^2}{(1 - C_k(x,y))^2}\left(\frac{1}{k}\left\|P_x f^{x,k}(y)\right\|^2\right). \tag{4.24}$$

The key lemma to be proven before starting probabilistic calculations is the following.


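Lemma 4.2. Let $M$ be a manifold satisfying the conditions of Assumption 2.1, embedded into $S^{k-1}$ via the embedding map $h^k$ defined in (1.2). Assume that $f$ satisfies Assumptions 3.1 and 3.2. Then, with probability one, the reach of $h^k(M)$ is given by

$$\cot^2\theta_k = \sup_{x \in M}\ \sup_{y \in M \setminus \{x\}} \left[\frac{k}{\|f^k(y)\|^2}\,\frac{(1 - C(x,y))^2}{(1 - C_k(x,y))^2}\left(\frac{1}{k}\left\|f^{x,k}(y)\right\|^2\right) - E^{x,k}(y)\right]. \tag{4.25}$$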


Proof. The global critical radius is obtained by taking infima of local critical radii as given in (2.8). However, since the cotangent is decreasing in the first quadrant, we have

$$\cot^2\theta_k = \sup_{x \in M} \cot^2\theta_k(x).$$

Using the result from Lemma 4.1, the above is equal to

$$\sup_{x \in M}\ \sup_{y \in M \setminus \{x\}} \frac{\left\|(I - P_x)\,h^k(y)\right\|^2}{\left(1 - \langle h^k(x), h^k(y)\rangle\right)^2}. \tag{4.26}$$


Since $f$ is centered Gaussian, its derivatives are also centered Gaussians. Thus, the orthogonal projection $P_x(f(y))$ of $f(y)$ onto the space spanned by $f(x)$ and $\nabla f(x)$ is

$$E\{f(y) \mid f(x), \nabla f(x)\}.$$

This observation, along with (3.14), (4.26), and the fact that $(I - P_x)f^x(y) = f^x(y)$, shows that

$$\cot^2\theta_k = \sup_{x \in M}\ \sup_{y \in M \setminus \{x\}} \frac{k}{\|f^k(y)\|^2}\,\frac{(1 - C(x,y))^2}{(1 - C_k(x,y))^2}\left(\frac{1}{k}\left\|(I - P_x) f^{x,k}(y)\right\|^2\right).$$

From the fact that we have orthogonal projections, this is

$$\sup_{x \in M}\ \sup_{y \in M \setminus \{x\}} \frac{k}{\|f^k(y)\|^2}\,\frac{(1 - C(x,y))^2}{(1 - C_k(x,y))^2}\left(\frac{1}{k}\left\|f^{x,k}(y)\right\|^2 - \frac{1}{k}\left\|P_x f^{x,k}(y)\right\|^2\right),$$

and the lemma is proven. □

We shall see later that the error term $E^{x,k}(y)$ in (4.25) goes to zero, and so we shall be primarily concerned with the a.s. convergence of

$$\sup_{x \in M}\ \sup_{y \in M \setminus \{x\}} \frac{k}{\|f^k(y)\|^2}\,\frac{(1 - C(x,y))^2}{(1 - C_k(x,y))^2}\left(\frac{1}{k}\left\|f^{x,k}(y)\right\|^2\right). \tag{4.27}$$

For this, we need to establish convergence results for the three terms here. The results we need are stated as four lemmas in the next section.


5 Four key lemmas and the proof of the main theorem 

The proof of Theorem 3.3 follows from the four lemmas stated below and is given at the 
end of this section. Throughout this section we shall assume, without further comment, 
that $M$ satisfies the conditions of Assumption 2.1. The conditions on $f$ vary, since not
all the lemmas require the same level of smoothness. All the conditions, however, are 
implied by Assumptions 3.1 and 3.2. 

We start by showing that the first two terms in (4.27) converge uniformly, with 
probability one, to 1. 

Lemma 5.1. Let $f^k$ be an $\mathbb{R}^k$-valued random process on $M$, with i.i.d. components, each a centered, unit variance Gaussian process over $M$, with a.s. continuous sample paths. Then, with probability one,

$$\lim_{k \to \infty}\ \sup_{y \in M} \left|\frac{\|f^k(y)\|^2}{k} - 1\right| = 0.$$


Lemma 5.2. Let $f^k$ be as in Lemma 5.1, but also . Denote the covariance function of its components by $C(x,y)$, and let $C_k(x,y)$ be as defined in (4.23). Then, with probability one,

$$\lim_{k \to \infty}\ \sup_{(x,y):\, x \ne y} \left|\frac{1 - C(x,y)}{1 - C_k(x,y)} - 1\right| = 0.$$
The third lemma (after some trivial calculations) will, as we shall see below, give us that the remaining term in (4.27) converges to the parameter $\sigma_c^2(f)$.


Lemma 5.3. Under the same assumptions as in Lemma 5.2, and with probability one,

$$\lim_{k \to \infty}\ \sup_{(x,y):\, x \ne y} \left|\frac{\|f^{x,k}(y)\|^2}{k} - \mathrm{Var}\left(f^x(y)\right)\right| = 0. \tag{5.28}$$


It will follow from the proof of this lemma that $f^x(y)$ is bounded even when we are arbitrarily close to diag$(M \times M)$. This is needed to ensure that all the terms defined in (4.27) are, a priori, well defined.

The final step we need is the following. 

Lemma 5.4. Under the same assumptions as in Lemma 5.2, and with $E^{x,k}(y)$ as defined in (4.24), we have, with probability one,

$$\lim_{k \to \infty}\ \sup_{(x,y):\, x \ne y} E^{x,k}(y) = 0.$$


We now show how to prove the main result as a straightforward consequence of the previous four lemmas.

Proof of Theorem 3.3. It is immediate from the results of Lemmas 5.1, 5.2 and 5.4 that, with probability one,

$$\lim_{k \to \infty} \cot^2\theta_k = \lim_{k \to \infty}\ \sup_{(x,y):\, x \ne y} \frac{\|f^{x,k}(y)\|^2}{k}, \tag{5.29}$$

and we shall be done once we show that the right hand limit is $\sigma_c^2(f)$.

However, this is immediate from the much stronger result in Lemma 5.3 that

$$\lim_{k \to \infty}\ \sup_{(x,y):\, x \ne y} \left|\frac{\|f^{x,k}(y)\|^2}{k} - \mathrm{Var}\left(f^x(y)\right)\right| = 0$$

and that, by definition,

$$\sigma_c^2(f) = \sup_{(x,y):\, x \ne y} \mathrm{Var}\left(f^x(y)\right).$$

This completes the proof of Theorem 3.3, modulo proving the four lemmas. □


6 Some (standard) notation 

Many of the proofs to follow freely use standard notation from Differential Geometry. 
Since we expect that not all readers will be familiar with this, we include here a brief 
notational guide. There are many standard texts to which one can turn for details. 
Lee’s book [15] is our favourite, but the quick and dirty treatment in Chapter 7 of [1] 
also suffices. 

We are working with a Riemannian manifold $(M, g)$, for which the Riemannian metric is, for each $x \in M$, an inner product $g_x$ on the tangent space $T_xM$ to $M$ at $x$. If $\{(U_\alpha, \phi_\alpha)\}_\alpha$ is an atlas for $M$, then for each chart $(U_\alpha, \phi_\alpha)$ we shall often need a (local) orthonormal frame field $X^\alpha = \{X_1^\alpha, \dots, X_m^\alpha\}$ for the tangent bundle $T^\alpha M = \{T_xM,\ x \in U_\alpha\}$, where orthonormality is in the metric $g$. We shall refer to this later as “choosing an orthonormal frame field”, without specific reference to charts or the index $\alpha$. Since all our later definitions and calculations are local (i.e. can be carried out in terms of local charts) this is not a problem (and global issues such as parallelizability do not arise).

If $F : M \to \mathbb{R}$ is a smooth function, and $X = X_x \in T_xM$, then by

$$XF(x),\quad (XF)(x),\quad (X_xF)(x),\quad X_xF(x),\quad X_xF_x,\quad \text{etc.,}$$

we mean the derivative of $F$ in direction $X_x$ at $x$. At various times we will make use of all of these possible notations, so as to make individual formulae clearer or more compact.

As opposed to the above derivatives, the gradient, $\nabla F$, of $F$ is the unique continuous vector field on $M$ such that

$$g_x(\nabla F_x, X_x) = X_xF, \tag{6.30}$$

for every vector field $X$. If $F$ is a function of more than one parameter, say $F(x,y)$, then we will denote the gradient with respect to $x$ as $\nabla_xF(x,y)$, etc.

The (covariant) Hessian $\nabla^2F$ is the bilinear symmetric map from $C^1(T(M)) \times C^1(T(M))$ to $C^0(M)$ (i.e. it is a double differential form) defined by

$$(\nabla^2F)(X, Y) = \nabla^2F(X, Y) = XYF - \nabla_XYF = g(\nabla_X\nabla F, Y), \tag{6.31}$$

where, while $\nabla$ with no subscript denotes the gradient, when it is subscripted with a vector field, as in $\nabla_X$, it indicates the Levi-Civita connection of $(M,g)$.

It is standard that $\nabla^2F$ could also have been defined to be $\nabla(\nabla F)$, which is where the notation comes from. Recall that in the simple Euclidean case the Hessian is typically considered to be the $N \times N$ matrix $H_F = (\partial^2F/\partial x_i\partial x_j)_{i,j=1}^N$. In the more general setting above, $H_F$ defines the two-form by setting $\nabla^2F(X, Y) = XH_FY'$. In this case (6.31) follows from simple calculus.

We shall need the obvious, but important, fact that if $x$ is a critical point of $F$ (i.e. $\nabla F(x) = 0$) then $XF(x) = 0$ for all $X \in T_xM$ and so by (6.31) it follows that $\nabla^2F(X,Y)(x) = XYF(x)$. Consequently, at these points, the Hessian is independent of the metric $g$.

This concludes our brief excursion into notation. We can now turn to the proofs
of the four lemmas of Section 5.


7 Proof of Lemma 5.1 


We need to prove that

$$\lim_{k \to \infty}\ \sup_{y \in M} \left|\frac{\|f^k(y)\|^2}{k} - 1\right| = 0, \quad \text{a.s.} \tag{7.32}$$


However, this follows almost trivially from the following standard strong law for 
Banach space valued random variables, which, since we use it often, we state in full. 

Theorem 7.1 ([14], Corollary 7.10). Let $X$ be a Borel random variable with values in a separable Banach space $B$ with norm $\|\cdot\|_B$. Let $S_n$ be the partial sum of $n$ i.i.d. realizations of $X$. Then,

$$\frac{S_n}{n} \xrightarrow{\text{a.s.}} 0$$

if, and only if, $E\{\|X\|_B\} < \infty$ and $E\{X\} = 0$.
To prove (7.32), we set $X = f^2(y) - 1$ in the above theorem. The Banach space $B$ is $C(M)$ (continuous functions over $M$), equipped with the sup norm. The mean zero condition is trivial. To check the moment condition on the norm of $f^2 - 1$, note that

$$E\left\{\sup_{y \in M} |f^2(y) - 1|\right\} \le 1 + E\left\{\sup_{y \in M} |f^2(y)|\right\} \le 1 + E\left\{\Big(\sup_{y \in M} |f(y)|\Big)^2\right\} < \infty.$$

Finiteness of the expectation here follows from the Borell-Tsirelson-Ibragimov-Sudakov inequality (e.g. Theorem 2.1.2 in [1]). This is all that is needed to prove (7.32).
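
The uniform strong law in (7.32) can be watched at work in a small simulation. The sketch below is only an illustration under stated assumptions: the field is a random trigonometric polynomial on the circle (not a process from the paper), the supremum is taken over a finite grid, and the function names are hypothetical.

```python
import numpy as np

def sample_fields(k, grid, weights, rng):
    """Draw k i.i.d. copies of an illustrative unit-variance Gaussian field on the circle.

    Each copy is sum_n sqrt(w_n) (a_n cos(n t) + b_n sin(n t)) with a_n, b_n ~ N(0, 1).
    Returns an array of shape (k, len(grid)).
    """
    freqs = np.arange(1, len(weights) + 1)
    amp = np.sqrt(weights)
    a = rng.standard_normal((k, len(weights))) * amp
    b = rng.standard_normal((k, len(weights))) * amp
    phase = np.outer(freqs, grid)
    return a @ np.cos(phase) + b @ np.sin(phase)

def sup_deviation(k, grid, weights, rng):
    """Grid approximation of sup_y | |f^k(y)|^2 / k - 1 |."""
    f = sample_fields(k, grid, weights, rng)
    return np.max(np.abs((f ** 2).sum(axis=0) / k - 1.0))

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
weights = np.array([0.5, 0.3, 0.2])   # sums to 1, so each copy has unit variance

# Deviation of the normalised squared norm from 1, for increasing k.
deviations = {k: sup_deviation(k, grid, weights, rng) for k in (100, 1000, 20000)}
```

The entries of `deviations` should shrink roughly like $k^{-1/2}$, which is consistent with, but of course no substitute for, the almost sure statement.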


8 Proof of Lemma 5.3 


Before starting this proof in earnest, we need to check that all the terms that are implicitly assumed to exist in the statement of the lemma are well defined. In particular, we need to consider the limits

$$\lim_{y \to x} f^x(y) = \lim_{y \to x} \frac{f(y) - E\{f(y) \mid f(x), \nabla f(x)\}}{1 - C(x,y)}, \tag{8.33}$$

the problem being that both numerator and denominator tend to zero in the limit. If (8.33) is not well defined, then the supremum in the lemma makes no sense. (Note that away from the diagonal in $M \times M$ there is no problem with either boundedness or continuity, due to the assumed smoothness of $f$.)


8.1 The limit (8.33) is well defined 

The proof is basically an application of L’Hopital’s rule. To start, we take an orthonormal frame field $X = \{X_1, \dots, X_m\}$ for the tangent bundle of $M$, where orthonormality is in the induced metric $g$ of (3.12) and with the conventions described in Section 6.

Then standard computations for this situation (cf. Section 12.2.2 of [1] for precisely this case) give that the vector $(f(y), f(x), \nabla f(x))$ has a mean zero, multivariate Gaussian distribution with covariance matrix

$$\begin{pmatrix} 1 & C(x,y) & \nabla_xC(x,y)' \\ C(x,y) & 1 & 0 \\ \nabla_xC(x,y) & 0 & I \end{pmatrix}.$$

From this and the definition of Gaussian conditional expectations, it immediately follows that

$$f^x(y) = f(x) + \frac{f(y) - f(x)}{1 - C(x,y)} - \sum_{i=1}^m \frac{X_if(x)\, X_iC(x,y)}{1 - C(x,y)}. \tag{8.34}$$
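
For readers who want the intermediate step behind (8.34), the following is a sketch of the computation. It assumes only the standard regression formula for Gaussian conditional expectations, together with the block structure of the covariance matrix above (so that $f(x)$ and the components of $\nabla f(x)$ are uncorrelated with unit variances).

```latex
\begin{align*}
E\{f(y) \mid f(x), \nabla f(x)\}
  &= C(x,y)\, f(x) + \sum_{i=1}^m X_i C(x,y)\, X_i f(x), \\
f(y) - E\{f(y) \mid f(x), \nabla f(x)\}
  &= \bigl(f(y) - f(x)\bigr) + \bigl(1 - C(x,y)\bigr) f(x)
     - \sum_{i=1}^m X_i f(x)\, X_i C(x,y).
\end{align*}
% Dividing the second line by 1 - C(x,y) gives the three terms of (8.34).
```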


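Now take any $X = \sum_{i=1}^m a_iX_i \in T_xM$, and let $c$ be a curve in $M$ such that

$$c : (-\delta, \delta) \to M, \qquad c(0) = x, \qquad \dot c(0) = X.$$

As $y \to x$ along this curve, we have

$$\lim_{y \to x} f^x(y) = \lim_{u \to 0}\left[f(x) - \frac{\dfrac{d f(c(u))}{du} - \sum_{i=1}^m X_if(x)\,\dfrac{d X_iC(x, c(u))}{du}}{\dfrac{d C(x, c(u))}{du}}\right]. \tag{8.35}$$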
Consider the limit of the ratio in the above expression, this being the only problematic term. This is

$$\frac{\left(\sum_i a_iX_i\right) f(x) - \sum_i X_if(x)\left(\sum_j a_jX_j\right) X_iC(x,x)}{\left(\sum_i a_iX_i\right) C(x,x)}. \tag{8.36}$$

Note that because of our choice of Riemannian metric, and the fact that the $X_i$ were chosen to be orthonormal, we have

$$X_jX_iC(x,x) = g(X_i, X_j) = \delta_{ij},$$


the Kronecker delta. Therefore the numerator in (8.36) is zero. The denominator is also zero because of the assumption of constant variance on $f$. Thus, to find the true limit, another application of L’Hopital’s rule is necessary, and so we have

$$\lim_{y \to x} f^x(y) = f(x) - \lim_{u \to 0} \frac{\dfrac{d^2 f(c(u))}{du^2} - \sum_i X_if(x)\,\dfrac{d^2 X_iC(x, c(u))}{du^2}}{\dfrac{d^2 C(x, c(u))}{du^2}}.$$

It is easy to see that

$$\lim_{u \to 0} \frac{d^2 f(c(u))}{du^2} = XXf(x),$$

and

$$\lim_{u \to 0} \frac{d^2 X_iC(x, c(u))}{du^2} = XXX_iC(x,x) = E\{XXf\, X_if\} = E\{\nabla_XXf\, X_if\} = g(\nabla_XX, X_i),$$

where in the second-last equality, we have used calculations from [1] (cf. Eqn. (12.2.14)). Consequently,

$$\lim_{u \to 0} \sum_i X_if(x)\,\frac{d^2 X_iC(x, c(u))}{du^2} = \sum_i g(\nabla_XX, X_i)\, X_if(x) = \nabla_XXf(x),$$

and so, moving to the notation of 2-forms, the limit in (8.35) is given by the well defined expression

$$f(x) - \frac{\nabla^2 f(x)(X,X)}{\nabla^2 C(x,x)(X,X)},$$

and the limit in (8.33), albeit dependent on the path of approach of $y$ to $x$, is also well defined. As a consequence, we also have that, for each finite $k$,

$$\sup_{(x,y):\, x \ne y} \frac{\|f^{x,k}(y)\|^2}{k}$$

is a.s. finite.
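
The behaviour of $\mathrm{Var}(f^x(y))$, both away from and near the diagonal, can be explored numerically in a toy case. The sketch below assumes a stationary random trigonometric field on the circle with covariance $C(h) = \sum_n w_n \cos(nh)$ (an illustrative choice, not a process from the paper) and uses the conditional variance $1 - C^2 - |\nabla_x C|^2$ that follows from the covariance matrix of $(f(y), f(x), \nabla f(x))$ given in this subsection, with the gradient taken in the unit-speed frame of the induced metric. The function names are hypothetical.

```python
import numpy as np

def conditional_variance(h, weights):
    """Var(f^x(y)) at lag h = y - x for the illustrative field on the circle."""
    h = np.asarray(h, dtype=float)
    weights = np.asarray(weights, dtype=float)
    freqs = np.arange(1, len(weights) + 1)
    lam = np.sum(weights * freqs ** 2)          # Var f'(t), the second spectral moment
    phase = np.multiply.outer(h, freqs)
    C = np.cos(phase) @ weights                 # covariance at lag h
    dC = -(np.sin(phase) * freqs) @ weights     # derivative of the covariance in h
    # Var(f(y) | f(x), grad f(x)) = 1 - C^2 - dC^2 / lam, then divide by (1 - C)^2.
    return (1.0 - C ** 2 - dC ** 2 / lam) / (1.0 - C) ** 2

h_grid = np.linspace(0.05, 2.0 * np.pi - 0.05, 2000)   # stay off the diagonal h = 0

# A single frequency embeds the circle as a great circle: the variance vanishes.
sup_great_circle = np.max(np.abs(conditional_variance(h_grid, [1.0])))

# Mixed frequencies: compare the near-diagonal value with the variance of
# f + f''/lam, which is what the L'Hopital limit gives on the circle.
weights = np.array([0.5, 0.3, 0.2])
freqs = np.arange(1, len(weights) + 1)
lam = np.sum(weights * freqs ** 2)
mu4 = np.sum(weights * freqs ** 4)
diagonal_limit = mu4 / lam ** 2 - 1.0
near_diagonal = conditional_variance(1e-2, weights)
sup_mixed = np.max(conditional_variance(h_grid, weights))
```

In this toy case `sup_mixed` plays the role of the constant appearing in the limit of $\cot^2\theta_k$, and the vanishing value for a single frequency matches the fact that a great circle has reach $\pi/2$.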


8.2 Completing the proof 

We now turn to the proof of the lemma, establishing (5.28).

This, however, follows exactly along the lines of the proof of Lemma 5.1, again applying Theorem 7.1. We need only take as our Banach space $C_b(M)$, the bounded, continuous functions on $M$ with supremum norm, and as our basic random variable $X = (f^x(y))^2 - \mathrm{Var}(f^x(y))$.

The previous subsection establishes the a.s. boundedness of $X$ needed to make the argument work.
9 Proof of Lemma 5.2


Lemma 5.2 involves showing that the ratio

$$\frac{1 - C(x,y)}{1 - C_k(x,y)}$$

converges, uniformly, to one, as $k \to \infty$. For given $x \ne y$, this is straightforward, following from a strong law of large numbers, much as in the previous two proofs. However, as $x \to y$, even for fixed $k$, there is no easy way to find a uniform bound on the ratio, since both numerator and denominator tend to zero.

9.1 Outline of the proof 

We start by writing $C_k$ as a sum of three terms:

$$C_k(x,y) = C(x,y) + \mathrm{Bias}(C_k(x,y)) + \xi_k(x,y),$$

where $\xi_k(x,y)$ is mean zero, random error with variance $\mathrm{Var}(C_k)$, and the deterministic bias term is $E\{C_k - C\}$.

We shall show in Appendix 2 that 



(9.37) 


(9.38) 


and that the remainder terms in both expressions are uniformly bounded over $M \times M$. (In fact, this is almost classical, in that expressions for the bias and variance of the correlation coefficient estimator centered around the sample means (as opposed to $C_k$, which is centered at zero) are well known in the Statistics literature, dating back, at least, to [12] [Chapter 16, see (16.73) and (16.74)]. Appendix 2 treats the zero-centered $C_k$ case.)

For notational convenience, set 



(9.39) 


Since the notation is getting long, from now on, we interchangeably use $a_k(x,y)$ and $a_k^{xy}$ for a function $a_k$ of $x$ and $y$, refrain from writing out explicitly the summation indices and their range in some situations where they are obvious, and also introduce

$$\beta_k^{xy} = \mathrm{Bias}(C_k(x,y)).$$


Then, in view of (9.37), and up to a term of 0{k ^) in the denominator, we have 



(9.40) 


1 -c^y 



Cancelling $(1 - C^{xy})$ from numerator and denominator, this becomes

$$\left[1 + \frac{C^{xy}(1 + C^{xy})}{2(k+1)} - \sqrt{k}\,\frac{C_k^{xy} - C^{xy} - \beta_k^{xy}}{1 - (C^{xy})^2}\cdot\frac{1 + C^{xy}}{\sqrt{k}}\right]^{-1}. \tag{9.41}$$

The only problematic term here is

$$\sqrt{k}\,\frac{C_k(x,y) - C(x,y) - \beta_k(x,y)}{1 - C^2(x,y)}, \tag{9.42}$$


since the second term converges deterministically to zero, and the final multiplicative factor is bounded by $2/\sqrt{k}$. We shall prove that the sequence of random processes defined by (9.42) converges weakly to a continuous Gaussian process on $M$. This, the extra divisor of $\sqrt{k}$ in (9.41), and some elementary probability arguments which we leave to the reader, will be enough to prove Lemma 5.2.

In fact, in view of (9.37), we can drop the bias term from (9.42), and suffice with the weak convergence, over $M$, of the processes

$$\tilde\zeta_k(x,y) \triangleq \frac{\zeta_k(x,y)}{1 - C^2(x,y)} = \sqrt{k}\,\frac{C_k(x,y) - C(x,y)}{1 - C^2(x,y)}. \tag{9.43}$$

We shall prove this in a number of stages.

To start, we show the weak convergence of the numerator in (9.43), $\zeta_k$, which is much less delicate than that of the ratio $\tilde\zeta_k$, there being no 0/0 issues. The convergence of the finite dimensional distributions is shown in the following Section 9.2 and the tightness in Section 9.3. The final step is to apply the continuous mapping theorem (e.g. [3], Section 1.5), for which we need to know that the mapping between function spaces that takes

$$\phi(x,y) \mapsto \frac{\phi(x,y)}{1 - C^2(x,y)} \tag{9.44}$$

is continuous, with probability one, for the process on $M \times M$ which is the limit of the $\zeta_k$. We have already seen that ratios like that on the right hand side here are problematic, and computable, at the $y \to x$ limit, only via L’Hopital’s rule. Consequently, the weak convergence of the $\zeta_k$ is going to have to be in a function space with a norm that takes into account convergence of derivatives as well as the function values. This is going to make the tightness argument rather intricate, which is why Section 9.3 is the longest in the paper. The continuous mapping argument will be given at the end, in the brief Section 9.4.


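9.2 Fi-di convergence of $\zeta_k$, and characterising the limit

The main result of this section is the following.

Lemma 9.1. The finite-dimensional distributions of $\zeta_k$, on $M \times M$, converge to those of the zero mean, Gaussian process, $\zeta$, with covariance function given by

$$\begin{aligned} E\{\zeta(x_0,y_0)\zeta(x,y)\} ={}& \tfrac{1}{2} C^{x_0y_0}C^{xy}\left[(C^{x_0x})^2 + (C^{x_0y})^2 + (C^{y_0x})^2 + (C^{y_0y})^2\right] \\ &+ C^{x_0x}C^{y_0y} + C^{x_0y}C^{y_0x} \\ &- C^{xy}\left[C^{x_0x}C^{y_0x} + C^{x_0y}C^{y_0y}\right] - C^{x_0y_0}\left[C^{x_0x}C^{x_0y} + C^{y_0y}C^{xy_0}\right]. \end{aligned} \tag{9.45}$$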
The proof will rely on the following result of Anderson.

Theorem 9.2 ([2], Theorem 4.2.3). Let $\{U(k)\}$ be a sequence of $d$-component random vectors and $b$ a fixed vector such that $\sqrt{k}(U(k) - b)$ has the limiting distribution $N(0, T)$ as $k \to \infty$. Let $g(u)$ be a vector-valued function of $u$ such that each component $g_j(u)$ has a nonzero differential at $u = b$, and let $\phi_b$ be the matrix with $(i,j)$-th component $(\partial g_j(u)/\partial u_i)|_{u=b}$. Then $\sqrt{k}(g(U(k)) - g(b))$ has the limiting distribution $N(0, \phi_b'T\phi_b)$.
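
To see the delta method of Theorem 9.2 in the simplest instance relevant here, consider a single pair of points with correlation $\rho$ and the map $g(u_1,u_2,u_3) = u_3/\sqrt{u_1u_2}$. The Python sketch below is an illustration under stated assumptions: the covariance matrix `T` of $(X^2, Y^2, XY)$ for a standard bivariate normal pair is computed by hand from Wick's formula, and the function name `limiting_variance` is hypothetical.

```python
import numpy as np

def limiting_variance(rho):
    """Delta-method variance of sqrt(k) * (zero-centered sample correlation - rho)."""
    # Covariance of (X^2, Y^2, XY) for standard normals X, Y with correlation rho.
    T = np.array([
        [2.0,          2.0 * rho**2, 2.0 * rho],
        [2.0 * rho**2, 2.0,          2.0 * rho],
        [2.0 * rho,    2.0 * rho,    1.0 + rho**2],
    ])
    # Gradient of g(u1, u2, u3) = u3 / sqrt(u1 * u2) at b = (1, 1, rho).
    grad = np.array([-rho / 2.0, -rho / 2.0, 1.0])
    return grad @ T @ grad

rhos = np.array([-0.7, 0.0, 0.3, 0.9])
variances = np.array([limiting_variance(r) for r in rhos])
```

For every `rho` the result agrees with the classical value $(1 - \rho^2)^2$, which is also what a covariance of the form (9.45) should reduce to when the two pairs of points coincide.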

Proof of Lemma 9.1. As one might guess from the complicated form of (9.45), the calculations involved are somewhat tedious, and so we shall concentrate on making the main steps clear. Towards that end, we introduce the following notation just for this proof. For any $n \in \mathbb{N}$, and points $(x_i, y_i) \in M \times M$, define

$$C_{11}^{x_i,y_i}(k) = \frac{\|f^k(x_i)\|^2}{k}, \qquad C_{22}^{x_i,y_i}(k) = \frac{\|f^k(y_i)\|^2}{k}, \qquad C_{12}^{x_i,y_i}(k) = \frac{1}{k}\sum_{\ell=1}^k f_\ell(x_i) f_\ell(y_i).$$

Now define

$$U(k) = \left(C_{11}^{x_1,y_1}, C_{22}^{x_1,y_1}, C_{12}^{x_1,y_1}, \dots, C_{11}^{x_n,y_n}, C_{22}^{x_n,y_n}, C_{12}^{x_n,y_n}\right),$$

and let $b = E\{U(k)\} = (1, 1, C(x_1,y_1), \dots, 1, 1, C(x_n,y_n))$. Thus, the elements of $U$ are the maximum likelihood estimators of the corresponding elements of $b$. It then follows from standard estimation theory (e.g. [2], Theorem 3.4.4) that $\sqrt{k}(U(k) - b)$ has a limiting normal distribution with mean 0 and some covariance matrix $T$, the specific structure of which does not concern us at the moment.

In order to prove the lemma, we require the asymptotic distribution of

$$\left(\sqrt{k}\left(C_k(x_i,y_i) - C(x_i,y_i)\right)\right)_{i=1}^n. \tag{9.46}$$

However, using the vector $U$ above it is easy to relate the $C_k$s to the $C_{ij}$s, and if we now define a function $g : \mathbb{R}^{3n} \to \mathbb{R}^n$ by

$$g(u_1, u_2, \dots, u_{3n}) = \left(\frac{u_3}{\sqrt{u_1u_2}}, \dots, \frac{u_{3n}}{\sqrt{u_{3n-2}u_{3n-1}}}\right),$$

then Theorem 9.2 establishes the claimed convergence of finite dimensional distributions, and so proves the lemma, modulo two issues.

The first is the condition on the differential required by Theorem 9.2, but this is trivial. The second is to derive the exact form (9.45) of the limiting covariances, which, while not intrinsically hard, is a long and tedious calculation. The calculation starts by writing out the covariance function for $\zeta_k$ and computing moments, all of which involve Gaussian variables. Fortunately, most of the detailed calculations that we need were carried out long ago in the statistical literature and can be found, for example, in [11] [e.g. Chapter 41, Example 41.6]. What remains is to send $k \to \infty$ in these expressions, and deduce (9.45). We shall not go through the tedious details here. □


9.3 Tightness of $\zeta_k$

For the reasons alluded to above and exploited below, we shall prove tightness in the Banach space of twice continuously differentiable functions on $M \times M$, which we denote by $B^{(2)}$, equipped with the norm

$$\|f\|_{B^{(2)}} = \max\{\|f\|_\infty,\ \|\nabla f\|_\infty,\ \|\nabla^2 f\|_\infty\}, \tag{9.47}$$

where the norms of the first and second order derivatives are obtained by taking the maximum over the norms of the $2m$ and $4m^2$ components of the Riemannian differential and Hessian, respectively.

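To break the rather long proof of tightness into bite sized pieces, we write

$$\zeta_k(x,y) = a_k(x,y) + \Delta_k(x,y), \tag{9.48}$$

where

$$a_k(x,y) = \sqrt{k}\left(\frac{1}{k}\sum_j f_j(x)f_j(y) - C(x,y)\right)\Big/\ \frac{\|f^k(x)\|\,\|f^k(y)\|}{k}$$

and

$$\Delta_k(x,y) = \sqrt{k}\left(1 - \frac{\|f^k(x)\|\,\|f^k(y)\|}{k}\right)\frac{C(x,y)}{\|f^k(x)\|\,\|f^k(y)\|/k}.$$

In the following two subsections we shall prove that the sequences $a_k$ and $\Delta_k$ converge weakly in $B^{(2)}$, from which the convergence of $\zeta_k$ immediately follows.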
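
The split of $\sqrt{k}(C_k - C)$ into a cross-product term and a norm term is purely algebraic, and can be checked on simulated data. In the sketch below the formulas for the two pieces are the ones obtained by writing $C_k$ as the averaged cross product divided by the normalised product of norms; they should be read as an assumption about the intended decomposition, and the function name `decompose` is hypothetical.

```python
import numpy as np

def decompose(fx, fy, c):
    """Return (zeta, a, delta) for samples fx, fy of f(x), f(y) and true correlation c."""
    k = fx.size
    s = fx @ fy / k                                     # averaged cross product
    d = np.linalg.norm(fx) * np.linalg.norm(fy) / k     # normalised product of norms
    zeta = np.sqrt(k) * (s / d - c)                     # sqrt(k) * (C_k - C)
    a = np.sqrt(k) * (s - c) / d                        # cross-product term
    delta = np.sqrt(k) * (1.0 - d) / d * c              # norm term
    return zeta, a, delta

rng = np.random.default_rng(2)
k, c = 500, 0.6
fx = rng.standard_normal(k)
fy = c * fx + np.sqrt(1.0 - c**2) * rng.standard_normal(k)   # correlation c with fx

zeta, a, delta = decompose(fx, fy, c)
```

The identity `zeta == a + delta` holds up to rounding for any data, which is why the weak convergence of the sum follows once both pieces are handled.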


9.3.1 $a_k$ converges weakly in $B^{(2)}$

We start with something even simpler than $a_k$, viz. the sequence of random functions $\eta_k$ defined by

$$\eta_k(x,y) = \sqrt{k}\left(\frac{1}{k}\sum_{j=1}^k f_j(x)f_j(y) - C(x,y)\right). \tag{9.49}$$

To prove the weak convergence of this sequence, we use the theorem stated as part of Example 1.5.10 in [23] [p. 41] (also see the discussion after the statement of the theorem).

To this end, note that the summands in (9.49) are i.i.d. copies of the random function $f \otimes f : M \times M \to \mathbb{R}$ defined by $(f \otimes f)(x,y) = f(x)f(y)$. If we endow $M \times M$ with the topology induced by the Riemannian distance $d_{M \times M}$ (this is the metric we use in place of the semi-metric in the theorem from [23]), then $M \times M$ is compact in this topology. Since the convergence of the finite dimensional distributions of (9.49) follows from Theorem 9.2, all that is left to check for weak convergence of (9.49) is tightness.

In order to show tightness, we first need to set up some notation, in particular for Taylor expansions on $M \times M$, in terms of Riemannian normal coordinates.

Consider normal neighbourhoods $U_1, U_2$ (local coordinates $(x^i), (y^i)$, respectively) around $x_0$ and $y_0$ in $M$, and take $U_1 \times U_2$ around $(x_0, y_0)$ in $M \times M$. Then, $(x^i : y^i)$ give us the following definition of coordinates in the product space $U_1 \times U_2$:

$$(x^1, \dots, x^m : y^1, \dots, y^m)(x_0, y_0) = \left[(x^1, \dots, x^m)(x_0) : (y^1, \dots, y^m)(y_0)\right].$$

Since

$$T_{(x_0,y_0)}(U_1 \times U_2) = T_{x_0}U_1 \oplus T_{y_0}U_2,$$

any tangent vector $v$ in the product tangent space splits uniquely as the sum of $v_1 \in T_{x_0}U_1$ and $v_2 \in T_{y_0}U_2$. This further gives us the following definition for the exponential map in $M \times M$:

$$\exp_{(x_0,y_0)}(v) = \left(\exp_{x_0}(v_1), \exp_{y_0}(v_2)\right).$$
Let the coordinate basis vectors for the tangent spaces be denoted by $\left(\frac{\partial}{\partial x^i}\right)$ and $\left(\frac{\partial}{\partial y^i}\right)$, considered as row vectors. The concatenation of the two serves as a basis for the product tangent space, and so any vector $v$ in this space can be written as

$$v = v_1 \oplus v_2 = \sum_{i=1}^m v^i\frac{\partial}{\partial x^i} + \sum_{i=1}^m v^{i+m}\frac{\partial}{\partial y^i}.$$


This allows us to write the following Taylor expansion for $C(x,y)$ about $(x_0,y_0)$:

$$C(x,y) = C(x_0,y_0) + v\left[\frac{\partial C(x,y)}{\partial x^1}, \dots, \frac{\partial C(x,y)}{\partial y^m}\right]'\bigg|_{(x_0,y_0)} + \frac{1}{2}\,v\left[\frac{\partial^2 C(x,y)}{\partial^k x^i\,\partial^l y^j}\right]\bigg|_{(x_0,y_0)} v' + O(\|v\|^3),$$

where $k + l = 2$. Finally, we recall a few important facts from the topic of normal coordinates and geodesics (cf. [15]): the geodesic starting from $(x_0,y_0)$ in the direction $v$ is given in Riemannian normal coordinates by $t(v^1, \dots, v^{2m})$; geodesics are locally minimizing, and so, along with the previous fact, we have $\|v\|^2 = d^2_{M \times M}((x,y),(x_0,y_0))$. Also, importantly, since Christoffel symbols vanish at the centers of the normal charts, covariant derivatives at the centers reduce to usual partial derivatives. Therefore, working with normal coordinates is useful in local calculations.

Returning now to the issue of tightness, we need to establish moment bounds on the second derivatives of the processes $\eta_k$ of (9.49). Clearly, the variance and correlation function of $\eta_k$ are, respectively, the variance of $f(x)f(y) - C(x,y)$ and the correlation


$$E\{(f(x)f(y) - C(x,y))(f(x_0)f(y_0) - C(x_0,y_0))\}.$$


To investigate second derivatives, it is useful to move to the notation of 2-forms. 
Doing so, it follows from simple algebra that the diagonal elements of the Hessian 
matrix of this process are given by 


vV(x) 


A A 

dx^ ’ dx^ 


fiy) - 


A A 

’ dx^ 


d d 


dy^ dy^ 


^ fiy )ITT /(^) - ITT 


d d 


dy^ dy^ 


l<i,j <m, 


with other elements in the upper triangular portion falling into one of the three groups 


vV(x) 


/A A 

\ dx'- ’ dx^ 


d d 


f{y)-v‘^c^y 




A A^ 

’ dx^ ) ’ 


d d 


dy^ ’ dy^ 


1 < i < j < m, 
1 < * < j E w-, 


or 


dfix) df{y) 


dx^ dy^ 




d d 


1 < i < j < m. 


’ dy^ 

For the sake of illustration, we focus on only one ‘type’ of term. Computations for the 
other terms are basically the same. The term we shall consider is 


V' 




( d d \ 

’ dx"^ j ’ 


and we now also note that the term involving the derivatives of C does not present any 
problem for the upper bound on the increments since C is deterministic and the fact 
that f G implies that C is at least C®. Thus, it is enough to prove the following
bound 


E 



d d \ 
dx^ ’ dx^) 


f{y)-V^f{x) 



<Kdlj xmH^i y)i (^0) yo)) 


for any two pairs $(x,y), (x_0,y_0) \in M \times M$, and constant $K$. The expectation here is
bounded above by


2E 


dx'^dx^ 


f{y) 


dx'^dx^ 



(9.50) 


+2E 





As far as the first expectation here is concerned, using Wick’s formula, the fact that / 
has unit variance, and, for a differential operator D of any order, writing for 

yg), we have that it is equal to 


\Anxx 


dX 


+ 


QiQXQXo 


- 2 - 


qAqxxq 


d{x‘^Yd{x^Y d{x‘^Yd{x^Y d{x‘^Yd{x^Y 


.Qvyo 


(9.51) 


+ 


/ Q 2 Qxy Y 


+ 


/ Q2Qxoyo\^ Q2Qxyo Q2Qxoy 


\ dx‘^dx^ J 


- 2 


dx'^dx^ dx‘^dx^ 


The important point to be checked is that terms which are $O(1)$ and $O(\|v\|)$ cancel.
We check this thoroughly below. The second order terms can be trivially bounded 
using the facts that f G and |uj| < ||u||. This technique of bounding gives the 
required constant K independent of the points, but does not offer too much insight. 
Consequently, we illustrate how this can be done for one case only. 

Consider the case 


' Q'iQXXQ ^ Q'iQXXo 


Q^Qyyo 


i',3 'y=yo 




+ 


d^cyyo 


dy'^d{y 


i\ 2 \y=yo 




2m 


The above is obviously smaller than 


qZq^xxq 


=T.C\ V 


9(^l)3l-=-o 


+ ••• + 


Q'iQXXQ 


dx^d(x 


l'l2 l^=^0 


=xnX 


X 


Q'iQyyo 


d{y^) 


i'i3 'y=yo 


=11^ 


,m+l 


+ ••• + 


Q'iQyyo 


dy'^d{y 


i'i 2 \y=yo 


=Hr\ ^ 


,.2m 


This immediately yields the following upper bound, in which $M_3$ is a bound on the third order derivatives of $C$:

$$M_3^2\left[|v^1| + \cdots + |v^m|\right] \times \left[|v^{m+1}| + \cdots + |v^{2m}|\right] \le M_3^2 m^2 d^2_{M \times M}((x,y),(x_0,y_0)) = M_3^2 m^2 \|v\|^2.$$

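With the second order terms out of the way, we return to our claim that the zeroth and first order terms cancel out in (9.51). Focus first on the second term in that expression. For any $(x,y) \in M \times M$, introduce the function

$$g(x,y) \triangleq \frac{\partial^2 C^{xy}}{\partial x^i \partial x^j},$$

which, by assumption, is smooth enough to admit a first order expansion. Expanding this in a Taylor series about $(x_0,y_0)$, we have

$$g(x,y) = g(x_0,y_0) + \sum_{l=1}^m \frac{\partial g}{\partial x^l}\bigg|_{(x_0,y_0)} v^l + \sum_{l=1}^m \frac{\partial g}{\partial y^l}\bigg|_{(x_0,y_0)} v^{l+m} + O(\|v\|^2).$$

In shorter notation, let us write the above as

$$g(x,y) = g(x_0,y_0) + \sum_{l=1}^{2m} e_l v^l + O(\|v\|^2),$$

where the $e_l$ are the coefficients from the previous formula. Then,

$$\left(\frac{\partial^2 C^{xy}}{\partial x^i \partial x^j}\right)^2 + \left(\frac{\partial^2 C^{x_0y_0}}{\partial x^i \partial x^j}\right)^2 = 2(g(x_0,y_0))^2 + 2g(x_0,y_0)\sum_{l=1}^{2m} e_l v^l + O(\|v\|^2). \tag{9.52}$$

Next, define the following smooth functions over $M$:

$$h(x) \triangleq \frac{\partial^2 C^{xy_0}}{\partial x^i \partial x^j}, \qquad t(y) \triangleq \frac{\partial^2 C^{x_0y}}{\partial x^i \partial x^j}.$$

It is immediate that

$$h(x) = g(x_0,y_0) + \sum_{l=1}^m e_l v^l + O(\|v\|^2), \qquad t(y) = g(x_0,y_0) + \sum_{l=m+1}^{2m} e_l v^l + O(\|v\|^2).$$

Therefore,

$$-2\,\frac{\partial^2 C^{xy_0}}{\partial x^i \partial x^j}\,\frac{\partial^2 C^{x_0y}}{\partial x^i \partial x^j} = -2\Big[g^2(x_0,y_0) + g(x_0,y_0)\sum_{l=1}^{2m} e_l v^l\Big] + O(\|v\|^2). \tag{9.53}$$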


It is now clear that, as claimed, at least for the second expression in (9.51), the zeroth 
and first order terms in the Taylor expansion cancel (cf. (9.52) and (9.53)). 

Turning now to the first term in (9.51), define a function w : diag(M x M) —^ M 
by 

A 


w{x,x) = 

and a function on a : M —)■ M by 


a[x) = 


d{x‘^Yd{x^Y ’ 

QA:QXX0 


d{x‘^Yd{x^Y' 

These admit the Taylor series expansions 

W{X,X) = w{xo,Xo) + 2^ l(xo,xo)^' + 0(ll^lP)^ 

2=1 

and 


l{x) = w{xo,Xq) + ^ \ixo,xo)'l’' + )• 


2=1 


Noting that the Taylor series expansion of about yo is 
it is easy to see that here also, only the terms starting from the second order remain.
Again, following the same basic lines as in the previous argument shows that a similar 
upper bound holds for the second expectation in (9.50). From our earlier discussions, 
we are therefore done regarding proof of tightness of (9.49). 

In addition, since by Lemma 5.1 we know that $\sum_j (f_j(x))^2/k$ converges, uniformly over $M$, and with probability one, to 1, we have (e.g. [3] [Theorem 4.4]) the joint weak convergence of the pair

$$\left(\eta_k(x,y),\ \frac{\|f^k(x)\|\,\|f^k(y)\|}{k}\right).$$

Given this, the continuous mapping theorem immediately yields the weak convergence of $a_k$, as required.



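9.3.2 $\Delta_k$ converges weakly in $B^{(2)}$

Recall the expression for $\Delta_k$:

$$\Delta_k(x,y) = \sqrt{k}\left(1 - \frac{\|f^k(x)\|\,\|f^k(y)\|}{k}\right)\frac{C(x,y)}{\|f^k(x)\|\,\|f^k(y)\|/k}. \tag{9.54}$$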


We have already seen that the denominator in the rightmost ratio here converges a.s., and uniformly, to one, and so a simple application of Theorem 4.4 from [3] and the continuous mapping theorem imply that we need only concern ourselves with the weak convergence of the sequence of processes $\Gamma_k$ defined by

$$\Gamma_k(x,y) = \sqrt{k}\left(\frac{\|f^k(x)\|\,\|f^k(y)\|}{k} - 1\right). \tag{9.55}$$


To prove this convergence, we shall, for large enough $k$, bound $\Gamma_k$ from above and below by two sequences of processes, which converge to the same limit. These bounds (cf. (9.57) below) involve a common term, the weak convergence of which is known, and a smaller term, which converges a.s. and uniformly to zero.

The bound depends on the following algebraic inequality, due to Cartwright and Field [5].

Theorem 9.3 ([5]). Let $w_i$, $1 \le i \le n$, be numbers summing to 1. Let $x_i$ be positive numbers in $[a,b]$ ($0 < a < b$), whose arithmetic and geometric means are denoted by $AM_w$ and $GM_w$, respectively. Then,

$$\frac{1}{2b}\sum_i w_i(x_i - AM_w)^2 \le AM_w - GM_w \le \frac{1}{2a}\sum_i w_i(x_i - AM_w)^2.$$
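
A quick numerical check of an inequality of this type can be reassuring before it is applied. The Python sketch below evaluates the two Cartwright-Field bounds and the gap between the weighted arithmetic and geometric means; taking $a$ and $b$ to be the smallest and largest of the $x_i$, and the name `cartwright_field_bounds`, are choices made here for illustration.

```python
import numpy as np

def cartwright_field_bounds(x, w):
    """Return (lower bound, AM - GM, upper bound) for positive x and weights w summing to 1."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    am = np.sum(w * x)                       # weighted arithmetic mean
    gm = np.exp(np.sum(w * np.log(x)))       # weighted geometric mean
    spread = np.sum(w * (x - am) ** 2)       # weighted variance about the arithmetic mean
    a, b = x.min(), x.max()                  # tightest interval [a, b] containing the x_i
    return spread / (2.0 * b), am - gm, spread / (2.0 * a)

# Two equally weighted points, the configuration used when the theorem is applied.
lower, gap, upper = cartwright_field_bounds([1.0, 4.0], [0.5, 0.5])
```

With two equally weighted points $x_1, x_2$ the weighted variance is $(x_1 - x_2)^2/4$, so the bounds on the gap are $(x_1 - x_2)^2/(8b)$ and $(x_1 - x_2)^2/(8a)$.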


To apply Theorem 9.3 we note first that we know that $k^{-1}\sum_j (f_j(x))^2$ converges to 1 a.s. and uniformly. Thus, given any $\epsilon > 0$, there exists a (random) $k_0$ such that, for all $x \in M$, and all $k > k_0$,

$$1 - \epsilon < \frac{1}{k}\sum_j f_j^2(x) < 1 + \epsilon.$$

Now apply the theorem, assuming that $k > k_0$, taking $n = 2$, $[a,b] = [1 - \epsilon, 1 + \epsilon]$, $w_1 = w_2 = 1/2$, and

$$x_1 = \sum_j f_j^2(x)/k, \qquad x_2 = \sum_j f_j^2(y)/k.$$
Setting

$$N_k(x) = \sqrt{k}\left(\sum_j f_j^2(x)/k - 1\right), \tag{9.56}$$

a little algebra leads to

$$\frac{1}{2}\left(N_k(x) + N_k(y)\right) - \frac{(N_k(x) - N_k(y))^2}{8(1 - \epsilon)\sqrt{k}} \le \Gamma_k(x,y) \le \frac{1}{2}\left(N_k(x) + N_k(y)\right) - \frac{(N_k(x) - N_k(y))^2}{8(1 + \epsilon)\sqrt{k}}. \tag{9.57}$$


But this is precisely the inequality that we described above. Although it would be straightforward to establish it independently, the weak convergence of $N_k$ has already been proven in Section 9.3.1, since $N_k$ is just the process (9.49) over diag$(M \times M)$. From this immediately follows the weak convergence of the processes from $M \times M \to \mathbb{R}$ defined by $(x,y) \mapsto N_k(x) + N_k(y)$ and $(x,y) \mapsto N_k(x) - N_k(y)$.

This completes the proof of the weak convergence of $\Delta_k$.


9.4 The continuous mapping argument 

To complete the proof of Lemma 5.2, we exploit the fact, proven in the previous two sections, that $\zeta_k$ converges weakly in $B^{(2)}$ to the Gaussian process $\zeta$ with covariance function given by (9.45), and use it to show that the ratio processes

$$\tilde\zeta_k(x,y) = \frac{\zeta_k(x,y)}{1 - C^2(x,y)}$$

converge weakly in $C_b(M)$.

As described at the beginning of this section, this follows immediately from an application of the continuous mapping theorem, once we show that the mapping $H : B^{(2)} \to C_b(M)$, defined by

$$(H(\phi))(x,y) = \frac{\phi(x,y)}{1 - C^2(x,y)}, \tag{9.59}$$

is continuous, with probability one, for the probability measure supported on the paths of $\zeta$.

Recall that $\zeta$ is at least $C^2$ over $M \times M$ because of weak convergence in $B^{(2)}$. The same (and more) is true for the covariance function $C$, so the issue of continuity of $H$ is trivial if we restrict $\zeta$ to a region away from the diagonal of $M \times M$.

So the only question remaining is what happens as $y \to x$. What we shall now do is investigate the limits

$$\lim_{y \to x} \frac{\zeta(x,y)}{1 - C^2(x,y)},$$

and show that they depend only on ratios of well defined functions of the second derivatives of $\zeta$ and $C$. This will immediately imply the continuity of $H$, and thus complete the proof of Lemma 5.2.

To this end, take $X_x \in T_xM$, and a $C^2$ curve $c$ in $M$ such that

$$c : (-\delta, \delta) \to M, \qquad c(0) = x, \qquad \dot{c}(0) = X_x.$$
Then, using the symmetry of $\zeta$ and $C$,

$$\lim_{y \to x} \frac{\zeta(x,y)}{1 - C^2(x,y)} = \lim_{u \to 0} \frac{\zeta(c(u),x)}{1 - C^2(c(u),x)}.$$

It follows from (9.45) that the limit of the numerator is zero, while the same is true of
the denominator since $C(x,x) = 1$.

By L’Hôpital’s rule, the limit above is equal to

$$\lim_{u \to 0} \frac{d\zeta(c(u),x)/du}{-2\,C(c(u),x)\,dC(c(u),x)/du}. \qquad (9.60)$$

The denominator here is easily seen to tend to zero, since $x = y$ is a critical point for $C(x,y)$
and $C$ is differentiable. To check that the same is true for the numerator, note that $\zeta$ is
differentiable, and that its derivative has zero mean and covariance function given by the
second derivative of the covariance function of $\zeta$. That is,


$$E\{(X_y \zeta(x,y))^2\} = X_{y_1} X_{y_2} E\{\zeta(x,y_1)\zeta(x,y_2)\}\big|_{y_1 = y_2 = x}.$$


Using the specific form (9.45) of this covariance, and denoting for Xyj^CX'^^\y^=xi, 

we have that 


Xyi'^{C{yi,x)Ciy2,x)}\ yi=xi 

Xx — (c^^^)'^Xx —cy^^Xx 


Taking the additional derivative $X_{y_2}$, and then setting $x_1 = x_2 = x$ gives

$$2\big(\nabla^2 C(x,x)(X_x,X_x) - \nabla^2 C(x,x)(X_x,X_x)\big) = 0.$$

Thus, since the variance here is zero, $y = x$ is indeed a critical point of $\zeta(y,x)$, and
so to evaluate the limit in (9.60) we need yet another round of L’Hôpital’s rule. This
gives us that the limit is equal to

$$\lim_{u \to 0} \frac{d^2\zeta(c(u),x)/du^2}{-2\Big[\big(dC(c(u),x)/du\big)^2 + C(c(u),x)\,d^2C(c(u),x)/du^2\Big]} = \frac{-\nabla^2\zeta(x,x)(X_x,X_x)}{2\nabla^2 C(x,x)(X_x,X_x)},$$

the equality here following from the fact that $y = x$ is a critical point for both $\zeta(y,x)$
and $C(y,x)$.

However, all terms here are well defined, finite, and non-zero with probability one,
so we are done.
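The double application of L’Hôpital’s rule can be illustrated numerically. The sketch below is a deterministic toy version only: the function `C` stands in for the covariance along the curve (with $C(0) = 1$, $C'(0) = 0$, $C''(0) = -1$) and `phi` stands in for a sample path vanishing to second order at $u = 0$ (with $\phi''(0) = 3$). Both are assumptions chosen for illustration and are not taken from the paper. The ratio $\phi(u)/(1 - C^2(u))$ should then approach $-\phi''(0)/(2C''(0))$.

```python
import math


def C(u):
    """Stand-in covariance along the curve: C(0) = 1, C'(0) = 0, C''(0) = -1."""
    return math.exp(-u * u / 2)


def phi(u):
    """Stand-in path with phi(0) = phi'(0) = 0 and phi''(0) = 3."""
    return 1.5 * u * u * math.cos(u)


def ratio(u):
    """The 0/0 ratio phi(u) / (1 - C(u)^2) whose limit at u = 0 is sought."""
    return phi(u) / (1 - C(u) ** 2)


PHI_DD0 = 3.0   # second derivative of phi at 0
C_DD0 = -1.0    # second derivative of C at 0
predicted = -PHI_DD0 / (2 * C_DD0)  # limit given by two rounds of L'Hopital

for u in (1e-1, 1e-2, 1e-3):
    print(u, ratio(u), predicted)
```

The printed ratios settle on the predicted value as $u$ shrinks, which is the continuity at the diagonal that the continuous mapping argument requires.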


10 Proof of Lemma 5.4 


To prove the lemma, we need to show that the sequence 


sui^.U’"’^(y) 

x,y€M 


sup ■ 

x,y&M 


k {l-C{x,y)f 
(1 -Ck{x,y)Y 


PxrHvW 
converges to zero, with probability one.

By Lemmas 5.1 and 5.2 we know that each of the first two factors here a.s. converge,
uniformly over $M$, to one. So it suffices to show the convergence of the final factor to
zero, or that

$$\lim_{k \to \infty} \sup_{x,y \in M} \frac{1}{k}\|P_x f^{x,k}(y)\|^2 = 0, \quad \text{a.s.} \qquad (10.61)$$

Since we have already shown that $f^x(y)$ is a.s. bounded over $M$, the absolute value
of its supremum has (on a large deviations scale) Gaussian-like tails, and so standard
Gaussian arguments show that the maximum of $k$ i.i.d. copies of this process can,
asymptotically, be a.s. bounded by $C\sqrt{\log k}$ for some finite $C$.

Since $P_x$ is orthogonal projection onto an $(m+1)$-dimensional subspace of $\mathbb{R}^k$, it
now follows that, for large enough $k$,

$$\frac{1}{k}\|P_x f^{x,k}(y)\|^2 \le \frac{C(m+1)\log k}{k}, \qquad (10.62)$$


from which (10.61) now follows, and we are done. 
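The two ingredients of this argument, the $\sqrt{\log k}$ growth of a maximum of $k$ Gaussian-tailed variables and the decay of $(m+1)\log k / k$, can be seen in a small simulation. This is a simplified sketch: scalar standard Gaussians stand in for the suprema of the i.i.d. copies of the process, and the constant `c` in `projection_bound` is hypothetical.

```python
import math
import random


def max_abs_gaussian(k, rng):
    """Maximum absolute value among k i.i.d. standard Gaussian draws."""
    return max(abs(rng.gauss(0.0, 1.0)) for _ in range(k))


def projection_bound(k, m, c=2.0):
    """Right hand side of a (10.62)-style bound, with a hypothetical constant c."""
    return c * (m + 1) * math.log(k) / k


rng = random.Random(0)  # fixed seed so that the run is reproducible
for k in (10**2, 10**3, 10**4):
    # The ratio to sqrt(2 log k) should stay of order one as k grows.
    ratio = max_abs_gaussian(k, rng) / math.sqrt(2 * math.log(k))
    print(k, round(ratio, 3), projection_bound(k, m=2))
```

The ratio stays bounded while the projection bound tends to zero, which is the mechanism behind (10.61).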


11 Fluctuation Theory for Local Reaches 

We now return to the last part of Theorem 3.3, in which we described a fluctuation
result involving the local reaches of the random manifolds $h^k(M)$. In particular, we
want to consider the $k \to \infty$ distributional limit of the functions

$$\sqrt{k}\big(\cot^2\theta_k(\cdot) - \sigma_c^2(f,\cdot)\big) \qquad (11.63)$$

where $\theta_k(x)$, defined by (3.18), is the local reach of $h^k(M)$ at the point $h^k(x)$, for
$x \in M$.

The main result of this section is Theorem 11.1, which contains what is needed to 
complete the statement of Theorem 3.3, in that the limit process for (11.63) is now 
described in (formidable) detail. 

To make that detail appear a little more natural, we shall do a little algebra before 
stating the theorem. 


11.1 Some algebra and rearrangements 

From the proof of Lemma 4.2, we know that

$$\cot^2\theta_k(x) = \sup_{y \in M\setminus\{x\}} \big(R_k(x,y) - E^{x,k}(y)\big), \qquad (11.64)$$

where $E^{x,k}(y)$ is the ‘error’ term defined at (4.24) and we set

k 


Rk{x,y) = 




WfHvW (1 -Cfc(x,y))2 \k 


We already know from Lemma 5.4 that $E^{x,k}(y) \to 0$ uniformly in $x$ and $y$ as $k \to \infty$.
However, looking back over the proof, in particular the final inequality (10.62), we see
that the same is true for $\sqrt{k}E^{x,k}(y)$, from which it follows that we can ignore the error
term in (11.64). In addition, we also know from Theorem 3.3 that

$$\lim_{k \to \infty} \sup_{x,y} |R_k(x,y) - \mathrm{Var}(f^x(y))| = 0. \qquad (11.65)$$

Thus it seems not unreasonable that the structure of the limit of (11.63) might come
from a continuous mapping theorem and the weak convergence of the random processes
$\gamma_k$, where

$$\gamma_k(x,y) = \sqrt{k}\big(R_k(x,y) - \mathrm{Var}(f^x(y))\big), \qquad (x,y) \in M \times M. \qquad (11.66)$$

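If we now recall/introduce the notations

$$S_k^{(2)}(x,y) = \frac{1}{k}\sum_{j=1}^k \big(f_j^x(y)\big)^2, \qquad (11.67)$$

and

$$Z_k(x,y) = \sqrt{k}\left(\left(\frac{1 - C(x,y)}{1 - C_k(x,y)}\right)^2 - 1\right), \qquad (11.68)$$

$$B_k(x,y) = \sqrt{k}\big(S_k^{(2)}(x,y) - \mathrm{Var}(f^x(y))\big), \qquad (11.69)$$

$$N_k(y) = \sqrt{k}\big(S_k^{(1)}(y) - 1\big), \qquad (11.70)$$

then it takes no more than a few lines of simple algebra to check that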


lk{x,y) 


Bk{x,y) 


Nk{y)Zk{x,y) 


sf(Ty) 

VkT.^k\y) 


Nk{y) + T.f’{x,y)Zk{x,y). 


(11.71) 


Now we wave our hands a little to come to some vague conclusions, before stating
Theorem 11.1 which will make these conclusions precise, and then giving proofs. Firstly
however, to reduce the lengths of some of the formulae to come, we recall some of our
notational shorthand.


$$C^{xy} = C(x,y), \qquad C_k^{xy} = C_k(x,y), \qquad (11.72)$$

and introduce

$$V^{xy} = \mathrm{Var}(f^x(y)) = \frac{1 - (C^{xy})^2 - \sum_{i=1}^m (X_i C^{xy})^2}{(1 - C^{xy})^2}. \qquad (11.73)$$

(To see why the right hand side here is indeed $\mathrm{Var}(f^x(y))$, see (12.80) below.)
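The variance formula can be checked on a toy example. The sketch below assumes a hypothetical process on the circle, $f(t) = (\xi_1\cos t + \xi_2\sin t + \xi_3\cos 2t + \xi_4\sin 2t)/\sqrt{2}$ with i.i.d. standard Gaussian $\xi_j$, which has unit variance, covariance $C(x,y) = \tfrac12(\cos(x-y) + \cos 2(x-y))$ and $\mathrm{Var}(f'(x)) = 5/2$. It compares the closed form $(1 - (C^{xy})^2 - (X_1 C^{xy})^2)/(1 - C^{xy})^2$ with a direct variance computation for $f^x(y) = (f(y) - C^{xy}f(x) - X_1 f(x)\, X_1 C^{xy})/(1 - C^{xy})$, where $X_1$ is the unit-speed derivative in the induced metric. None of the names below come from the paper.

```python
import math

LAM = 2.5  # Var(f'(x)) for the toy process: (1 + 4) / 2


def cov(x, y):
    """Covariance C(x, y) of the toy process."""
    d = x - y
    return 0.5 * (math.cos(d) + math.cos(2 * d))


def dcov_dx(x, y):
    """Partial derivative of C(x, y) in its first argument."""
    d = x - y
    return -0.5 * (math.sin(d) + 2 * math.sin(2 * d))


def v_formula(x, y):
    """Closed form: (1 - C^2 - (X_1 C)^2) / (1 - C)^2 with X_1 = LAM^{-1/2} d/dx."""
    c = cov(x, y)
    xc = dcov_dx(x, y) / math.sqrt(LAM)
    return (1 - c * c - xc * xc) / (1 - c) ** 2


def basis(t):
    """Coefficients of the xi_j in f(t)."""
    s = math.sqrt(0.5)
    return [s * math.cos(t), s * math.sin(t), s * math.cos(2 * t), s * math.sin(2 * t)]


def dbasis(t):
    """Coefficients of the xi_j in f'(t)."""
    s = math.sqrt(0.5)
    return [-s * math.sin(t), s * math.cos(t),
            -2 * s * math.sin(2 * t), 2 * s * math.cos(2 * t)]


def v_direct(x, y):
    """Variance of f^x(y), computed from its expansion in the i.i.d. xi_j."""
    c, dc = cov(x, y), dcov_dx(x, y)
    weights = [by - c * bx - dbx * dc / LAM
               for by, bx, dbx in zip(basis(y), basis(x), dbasis(x))]
    return sum(w * w for w in weights) / (1 - c) ** 2


print(v_formula(0.3, 1.7), v_direct(0.3, 1.7))
```

The two numbers agree up to rounding, since $f(x)$ and $f'(x)$ are uncorrelated for a unit variance process and the cross terms collapse exactly as in the closed form.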

Now consider the various terms in (11.71). Although we did not state it explicitly,
we have actually already proven that $N_k$ has a Gaussian limit. (See the discussion
below in the proof of Lemma 12.1.) We also know, from Lemmas 5.1 and 5.3, that, as
$k \to \infty$, uniformly on $M$ and $M \times M$, respectively,

$$S_k^{(1)}(x) \xrightarrow{a.s.} 1 \quad \text{and} \quad S_k^{(2)}(x,y) \xrightarrow{a.s.} V^{xy}.$$

In addition, Lemma 5.2 and the a.s. convergence of $S_k^{(2)}$ lead to the expectation (this
is the handwaving step) that $B_k$ and $Z_k$ will both have Gaussian limits. Substituting
this ‘information’ into (11.71), the implication is that the first term on the right hand
side will have a Gaussian limit on $M \times M$, the second will converge to zero, the third will
converge to $V^{xy}$ times a Gaussian process on $M$, while the last term will converge to
$V^{xy}$ times a Gaussian process on $M \times M$. Unfortunately, all the limit processes will be
correlated, which is what makes the precise description of this a little long-winded, as
we now see.
11.2 The fluctuation result

Theorem 11.1. Let $f$ and $M$ satisfy the assumptions of Theorem 3.3, including the
conditions that $M$ is $C^6$ and that, with probability one, $f \in C^6(M)$. Then there exists
a sequence $\widetilde{\gamma}_k$ of random processes from $M \to \mathbb{R}$, such that, for all $x \in M$,

$$\sqrt{k}\,|\cot^2\theta_k(x) - \sigma_c^2(f,x)| \le |\widetilde{\gamma}_k(x)| \qquad (11.74)$$

and a limit process $\widetilde{\gamma} : M \to \mathbb{R}$ such that

$$\widetilde{\gamma}_k(\cdot) \Rightarrow \widetilde{\gamma}(\cdot). \qquad (11.75)$$

The convergence here is weak convergence, in the Banach space $C_b(M)$ of continuous
functions on $M$ with supremum norm, and

$$\widetilde{\gamma}(x) = \sup_{y \in M\setminus x} \gamma(x,y),$$

where $\gamma$ is the a.s. continuous Gaussian process over $M \times M$ representable in distribution
as

l{x,y) = ^{x,y) + 2V'"yC,{x,y){l+C{x,y)). (11.76) 

Here

1. $\eta(y)$ is a centered Gaussian process over $M$ with covariance function

$$E\{\eta(y_1)\eta(y_2)\} = 2(C(y_1,y_2))^2.$$

2. $\beta(x,y)$ is a centered Gaussian process over $M \times M$ with covariance function

$$E\{\beta(x_1,y_1)\beta(x_2,y_2)\} = 2\{E\{f^{x_1}(y_1) f^{x_2}(y_2)\}\}^2.$$

3. $\zeta(x,y)$ is a centered Gaussian process over $M \times M$ with covariance function


^{C{xi,yi)C{x2,y2)} 

1 


(1 - (C^ 1 W) 2)(1 - (C^ 2 ?/ 2 ) 2 ) 

X ^ ^^yiy2'^2 ^ ^ (^C^iy2-^2^ 

(2^2yi _ (2^1X2(2X2y2'^ \^QXlX2 _ QXiy2QX2y2'^ 

_ QX\yi YlxiX2^xiy2 QViy2QX2yn^ 


As for the corresponding cross-covariance functions, we write them in terms of $X_1, \dots, X_m$,
an orthonormal (with respect to the induced metric) vector field on $M$. None of the
cross-covariances are dependent on the particular choice of vector field.


^{y{yi)P{x2,y2)} 

^{v{yi)Cix2,y2)} 


0 \r’yiy2 _ r’X2y2r’X2yi _ y' Y.fxyT_\ Ynxy2\ 12 
^ \y Z-zi \x=X2-^^l^ \x=X2\ 

(1 _ C^2y2y 

2C^2yicy^y2 _QX2y2 ^(^QX2yiy. _|_ ^^yiy2j2| 

1 _ (^Qx2y2Y ’ 
$E\{\zeta(x_1,y_1)\beta(x_2,y_2)\}$

1 


(1 - ((:* 2 ?/ 2 ) 2 ) 
2 


X *2 _QXiyiQXiX2 _ ' ^ ^ XiC^^^\x—x 

_j_ (QyiX2 _ ^xiyi^xiX2 _ y \ x—x XiC^^'^\x— 


Although Theorem 11.1 takes a lot of space to state, its main implication is simple:
The limiting fluctuations of the local reach of $h^k(M)$ are bounded by a functional of
a Gaussian process on $M$. The detailed structure of this Gaussian process is complicated,
and depends, in terms of properties such as differentiability, on the underlying
covariance of $f$. For example, while the limit is a.s. continuous, it will not typically be
differentiable, and fine sample path properties such as Hölder continuity will depend
on the behaviour of the underlying covariance $C$ in ways that are not at all obvious.

12 Proof of Theorem 11.1 

We start with two lemmas, and then use these to complete the proof in Section 12.2.

12.1 Two Lemmas

Lemma 12.1. Under the conditions of Theorem 11.1, and with the notation of the
previous section, we have the joint weak convergence of the following vector valued
process in $C_b(M)$:

$$(S_k^{(1)}, S_k^{(2)}, B_k, N_k, Z_k) \Rightarrow (1, V, \beta, \eta, 2\zeta(1 + C)), \qquad (12.77)$$

where $V : M \times M \to \mathbb{R}$ is defined by $V(x,y) = V^{xy}$.

Proof. Most of the pieces that make up the proof of Theorem 11.1 are actually already
in hand. For a start, by Lemmas 5.1 and 5.3 we know $S_k^{(2)}(x,y)$ and $S_k^{(1)}(y)$ converge to
the deterministic limits $V^{xy}$ and 1, respectively, where the convergence is almost surely
uniform in $(x,y) \in M \times M$ and $y \in M$. The corresponding weak convergence is, obviously,
implied by this. Secondly, in Section 9.3.1 we established the weak convergence of $N_k$
in $C_b(M)$ (cf. (9.56) and the last paragraph of Section 9.3.2).

To add $Z_k$ to this convergence, note that the main term there, $(1 - C(x,y))/(1 - C_k(x,y))$,
already appeared in Section 9.1, and can be rewritten as in (9.40). Substituting
there the $\zeta_k(x,y)$ of (9.43) and expanding out the powers, simple algebra leads
to the fact that

$$Z_k(x,y) = 2(1 + C(x,y))\zeta_k(x,y) + O(1/\sqrt{k}), \qquad (12.78)$$

and we have already shown the weak convergence of $\zeta_k$ in $C_b(M)$ (cf. (9.58)).
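The expansion behind (12.78) can be checked numerically. The sketch below rests on two assumptions about notation that the text does not spell out at this point: that $Z_k = \sqrt{k}\big(((1 - C)/(1 - C_k))^2 - 1\big)$, and that the process multiplying $2(1 + C)$ is the ratio process $\sqrt{k}(C_k - C)/(1 - C^2)$ of (9.58). Under these assumptions the difference between the two sides, multiplied by $\sqrt{k}$, should stay bounded. The function names are illustrative only.

```python
import math


def z_k(c, c_k, k):
    """Assumed form of Z_k: sqrt(k) * (((1 - C) / (1 - C_k))^2 - 1)."""
    return math.sqrt(k) * (((1 - c) / (1 - c_k)) ** 2 - 1)


def zeta_k(c, c_k, k):
    """Assumed ratio process: sqrt(k) * (C_k - C) / (1 - C^2)."""
    return math.sqrt(k) * (c_k - c) / (1 - c * c)


c, z = 0.3, 1.0  # true correlation and a fixed value for the ratio process
for k in (10**2, 10**4, 10**6):
    c_k = c + z * (1 - c * c) / math.sqrt(k)  # chosen so that zeta_k equals z
    gap = z_k(c, c_k, k) - 2 * (1 + c) * zeta_k(c, c_k, k)
    # gap * sqrt(k) should remain bounded, i.e. gap = O(1 / sqrt(k)).
    print(k, gap * math.sqrt(k))
```

In this toy setting the rescaled gap approaches $3z^2(1 + C)^2$, the coefficient of the next term in the expansion of $(1 - \epsilon)^{-2}$ with $\epsilon = z(1 + C)/\sqrt{k}$.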
Note that to this point we have relied on results that arose in earlier parts of the
paper, and these required only that / G The additional level of differentiability 

required by the lemma, and so also by Theorem 11.1, comes from the following lemma,
which completes the collection by establishing the weak convergence of $B_k$.

In view of the fact that all the limit processes are either deterministic or Gaussian,
applications of Slutsky’s theorem and the Cramér-Wold device then complete the proof,
modulo calculating all the limit variances and covariances, for which we do not
intend to write out the details. □


Lemma 12.2. Under the conditions of Lemma 12.1, and with the notation of the
previous section, $B_k \Rightarrow \beta$ in $C_b(M)$.


Proof. The proof follows along the same lines as the proof of the weak convergence of
$\alpha_k$ described in Section 9.3.1.

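To start, we once again choose an orthonormal frame field $\{X_i\}$ for $M$, with the
conventions of Section 6. Write the corresponding Riemannian normal basis vectors
as $\{\partial/\partial x^i\}$, and $\{\partial/\partial x^i, \partial/\partial y^i\}$ as the corresponding basis for the tangent spaces on
$M \times M$. In this basis, we have

$$f^x(y) = \frac{f(y) - C^{xy} f(x) - \sum_{i=1}^m \frac{\partial f(x)}{\partial x^i}\frac{\partial C^{xy}}{\partial x^i}}{1 - C^{xy}}, \qquad (12.79)$$

and

$$V^{xy} = \mathrm{Var}(f^x(y)) = \frac{1 - (C^{xy})^2 - \sum_{i=1}^m \big(\partial C^{xy}/\partial x^i\big)^2}{(1 - C^{xy})^2}. \qquad (12.80)$$

If we now write

$$\Lambda_i(x,y) = (1 - C^{xy})^2\big((f_i^x(y))^2 - V^{xy}\big),$$

then we can also write

$$B_k(x,y) = \frac{k^{-1/2}\sum_{i=1}^k \Lambda_i(x,y)}{(1 - C^{xy})^2}. \qquad (12.81)$$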


Suppose we can show that the numerator here has a Gaussian limit, $\Lambda$ say, as $k \to \infty$.
Since it is a sum of i.i.d. processes, this should not be too hard. To complete the proof
of the weak convergence of the $B_k$ we could then use a continuous mapping argument,
as before, by defining a map, $H$ say, between functions on $M \times M$ via

$$(H(\phi))(x,y) = \frac{\phi(x,y)}{(1 - C^{xy})^2}, \qquad (12.82)$$

where the image function is in $C_b(M)$. For this to work, we need to know that $H$ is
continuous, with probability one, for the probability measure supported on the paths
of $\Lambda$. (This is not straightforward, since, as we shall soon see, we once again run into
0/0 issues for $(H(\Lambda))(x,y)$ as $x \to y$.) As a first step in checking this continuity, we
need to know something about $\Lambda$, and the function space on which the convergence of
the numerator in (12.81) to $\Lambda$ occurs.

We start with $\Lambda$. Since, by assumption, it is mean zero Gaussian, all of its properties
are determined by its covariance function. Given the expressions (12.79) and (12.80),
it is not hard to check that this is given by

E{Af(a;i,yi)A£(x2,2/2)} 

= E{A(xi,yi)A(x2,y2)} 

_|_ QXlX2QXiyiQX2y2 


dx^ dx^ 




QQX2XI QQX2y2 


E 




()C^iy2 Qc^iyi 

dx^ 9x* 

QQxiyi Q^X2y2 q2^xiX2 

9x* 9x* dx^dx^ 


(12.83) 




(Note that setting $x = x_1 = x_2$ and $y = y_1 = y_2$ here is what gives the numerator in
the expression for $V^{xy}$ in (11.73).)

We can now consider the behaviour of

$$\lim_{y \to x} \frac{\Lambda(x,y)}{(1 - C^{xy})^2}. \qquad (12.84)$$


To see how this works, we restrict the argument to the case in which $M$ is one-dimensional.
While notationally much simpler than the general case (although we shall
see in a moment that it is hardly ‘simple’) it is indicative of the general situation. In
the general case the limit in (12.84) will be taken along a specific path of $y$’s, for which
the final direction of approach to $x$ will be what plays the role of the single dimension
in the following calculation.

Taking then $x, y \in M \subset \mathbb{R}$, it is an immediate consequence of (12.83) that the
variance of $\Lambda(x,y)$ tends to zero as $y \to x$, and thus so does $\Lambda(x,y)$ itself. The
denominator here clearly also converges to zero, and so to compute the ratio we need
to resort to an application of L’Hôpital’s rule, which gives us that the limit in (12.84)
is the same as

$$\lim_{y \to x} \frac{\partial\Lambda(x,y)/\partial y}{-2(1 - C^{xy})\,\partial C^{xy}/\partial y}. \qquad (12.85)$$

Once again, it is obvious that the denominator vanishes in the limit.

As for the numerator, it follows from (12.83) and the fact that y-norm of ^ is one 
that 


E 


' dL{x,y) 

dx 

- ^ ^ 
dyi dy 2 


y=x 


E{L{x,yi)L{x,y 2 )}\ yi=y2=x 


— 4 I Qyiy 2 _ QxyiQxy 2 _ 


dcxyi Qcxy2 




+ 4 



'd'^cy^y^ 

. dyidy2 

dcy-^y2 

dc^yi ^ 

dyi 

dyi 

< 

' dcy^y^ 

. dy2 


dx dx J 

dc^yi Qcxy2 


d'^C^yi d^C^y^ 


Qxy2 _ 


dyi dy 2 dyidx dy 2 dx 

d^C^yi QCxy2 \ 

dyidx dx J 
dC^y^ ^xy, Qc^yi \ 

dy 2 dy 2 dx dx J 
However, evaluated at $y_1 = y_2 = x$, this also vanishes, so yet another application of
L’Hôpital’s rule is required.

In fact, two more applications of L’Hôpital’s rule are required, and while the derivation
follows the line of the previous applications, the formulae are rather long, and so
we will skip the details. However, in the end, one finds that


,. Mx,y) 

iim ---- 

y^x (1 — C^y)^ 


d'^L{x,x) 

dx'^ 

( 

I dx'^ 


2 ’ 


where the variance of the numerator is 


72 


d^c^x 

dx'^ 



( 12 . 86 ) 


(12.87) 


which, like the denominator of (12.86) is non-zero. (This is a consequence of the 
non-degeneracy assumed in Assumption 3.2.) 

The punch line to all this is that in order to apply the continuous mapping theorem
with the mapping $H$ of (12.82), we need to have convergence not only of the sum
$k^{-1/2}\sum_{i=1}^k \Lambda_i$, but also of its first four derivatives. That is, we need weak convergence
in the Banach space $C^{(4)}$ of four times continuously differentiable functions on $M \times M$,
equipped with the norm

$$\|f\|_{C^{(4)}} = \max\{\|f\|_\infty, \|\nabla f\|_\infty, \|\nabla^2 f\|_\infty, \|\nabla^3 f\|_\infty, \|\nabla^4 f\|_\infty\},$$

(cf. (9.47)).

Now that we know what to do, the rest is, at least in principle, straightforward,
and the proof follows along the same lines as the proof of the weak convergence of
$\alpha_k$ we treated in Section 9.3.1. Convergence of finite dimensional distributions follows
from Theorem 9.2, while tightness requires the computation of moments of increments
of the $\Lambda_i$ and their first four derivatives. Note, however, that $\Lambda_i(x,y)$ involves $f_i^x(y)$.
Since we have seen that $f^x(y)$, as a function on $M$, basically possesses one less level
of differentiability than $f$ itself, requiring four derivatives for $\Lambda_i$ ultimately leads to
requiring $f \in C^5(M)$. In addition, since the arguments in Section 9.3.1 relied on a
Taylor expansion, one further derivative is required, which is why the lemma, and so
Theorem 11.1, require $f \in C^6(M)$.

We leave the details to the reader. While they are long and involved, the fact that
all random variables are either Gaussian or squares of Gaussians means that there is
nothing more involved than Wick’s formula and accounting. □


12.2 Proof of Theorem 11.1 

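From (11.64), (11.66) and the definition (3.15) of $\sigma_c^2(f,x)$ we have that

$$\begin{aligned}
\sqrt{k}\,|\cot^2\theta_k(x) - \sigma_c^2(f,x)|
&= \sqrt{k}\,\Big|\sup_{y \in M\setminus\{x\}} \big(R_k(x,y) - E^{x,k}(y)\big) - \sup_{y \in M\setminus\{x\}} V^{xy}\Big| \\
&\le \sqrt{k}\,\Big|\sup_{y \in M\setminus\{x\}} R_k(x,y) - \sup_{y \in M\setminus\{x\}} V^{xy}\Big| + \sqrt{k}\sup_{y \in M\setminus\{x\}} |E^{x,k}(y)| \\
&\le \sup_{y \in M\setminus\{x\}} |\gamma_k(x,y)| + \sqrt{k}\sup_{y \in M\setminus\{x\}} |E^{x,k}(y)|.
\end{aligned}$$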
From the discussion preceding (11.65) we know that we can ignore the second term
here in the limit. The representation of $\gamma_k$ in (11.71) in terms of the processes
$S_k^{(1)}$, $S_k^{(2)}$, $B_k$, $N_k$ and $Z_k$, the joint weak convergence of all of these in Lemma 12.1, and
an application of the continuous mapping theorem, complete the proof of Theorem
11.1. □

Appendix 1 

We now give a proof of Lemma 4.1. As mentioned earlier, Lemma 4.1 is identical to
Lemma 2.1 of [19] and, as pointed out there, the proof is essentially the same as the
proof given in [10] for the one-dimensional case. Thus, we make no claim of originality,
and include the proof for completeness only.

Proof. Take $\eta_x \in T_x^\perp M \cap S(T_x S^{k-1})$, and consider the geodesic

$$\gamma_{(x,\eta_x)}(r) = \cos r \cdot x + \sin r \cdot \eta_x, \qquad r \ge 0.$$

To determine the local reach in the direction $\eta_x$, we need to know how far we can
extend $\gamma$ so that the metric projection of the endpoint is $x$. Clearly, this is until we
find a $y \ne x$ such that

$$\langle y, \gamma_{(x,\eta_x)}(r)\rangle > \langle x, \gamma_{(x,\eta_x)}(r)\rangle = \cos r.$$


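Consider first the case of $r < \pi/2$, so that $\cos r > 0$. Then the above two formulae imply
that we can extend the geodesic at least until such an $r$ as long as

$$\begin{aligned}
& \sup_{y \ne x} \big(\cos r\,(\langle y,x\rangle - 1) + \sin r\,\langle y,\eta_x\rangle\big) \le 0 \\
\iff\ & \sup_{y \ne x} \big(-\cos r\,(1 - \langle x,y\rangle) + \sin r\,\langle y,\eta_x\rangle\big) \le 0 \\
\iff\ & -\cot r + \sup_{y \ne x} \frac{\langle y,\eta_x\rangle}{1 - \langle x,y\rangle} \le 0 \\
\iff\ & \sup_{y \ne x} \frac{\langle y,\eta_x\rangle}{1 - \langle x,y\rangle} \le \cot r.
\end{aligned}$$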


When $r > \pi/2$, the same argument gives that as soon as there is a $y$ such that $\langle y,\eta_x\rangle > 0$,

$$-\cos r\,(1 - \langle y,x\rangle) + \sin r\,\langle y,\eta_x\rangle > 0,$$

and therefore, for such $r$,

$$\sup_{y \ne x} \big(\cos r\,(\langle y,x\rangle - 1) + \sin r\,\langle y,\eta_x\rangle\big) > 0,$$

implying that such a $y$ is closer to $\gamma_{(x,\eta_x)}(r)$ than $x$ is. Hence, the geodesic can only be
extended up to a length less than or equal to $\pi/2$.

Thus, by our earlier argument for $r < \pi/2$, we note that on the set

$$Z = \Big\{(x,\eta_x) : \sup_{y \ne x} \langle y,\eta_x\rangle > 0\Big\},$$

the critical radius satisfies

$$\cot r(x,\eta_x) = \sup_{y \ne x} \frac{\langle y,\eta_x\rangle}{1 - \langle x,y\rangle} = \sup_{y \ne x} \frac{\langle y,\eta_x\rangle^+}{1 - \langle x,y\rangle} > 0,$$

where $x^+$ denotes the positive part of $x$. Meanwhile, on the set $Z^c$ we have

$$\cot r(x,\eta_x) \le 0.$$


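Therefore, we have the inequality

$$\cot r(x,\eta_x) \le \sup_{y \ne x} \frac{\langle y,\eta_x\rangle^+}{1 - \langle x,y\rangle},$$

which becomes an equality when

$$\sup_{y \ne x} \frac{\langle y,\eta_x\rangle^+}{1 - \langle x,y\rangle} > 0.$$

In other words, we have

$$\cot\Big(\min\Big(r(x,\eta_x), \frac{\pi}{2}\Big)\Big) = \sup_{y \ne x} \frac{\langle y,\eta_x\rangle^+}{1 - \langle x,y\rangle}.$$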

Finally, since $M$ is a closed manifold embedded into a sphere, the local reach at $x$,
which is an infimum over all $\eta_x$ above, cannot be greater than $\pi/2$. Thus we can truncate
at this angle, and so, by (2.7), (2.8), and the above, obtain that

$$\cot^2(\theta(x)) = \cot^2\Big(\inf_{\eta_x} r(x,\eta_x)\Big) = \sup_{\eta_x : \|\eta_x\| = 1}\ \sup_{y \ne x} \frac{(\langle y,\eta_x\rangle^+)^2}{(1 - \langle x,y\rangle)^2} = \sup_{y \ne x} \frac{\|P_{T_x^\perp M}\,y\|^2}{(1 - \langle x,y\rangle)^2},$$

as required. □
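The final formula can be tried out on a case where the answer is known. The sketch below takes $M$ to be a circle of latitude on the unit sphere $S^2$, at polar angle `phi0` from the north pole. This example is an assumption for illustration and is not taken from the paper. For such a circle the local reach is the geodesic distance to the nearer pole, so one expects $\cot^2\theta(x) = \cot^2\phi_0$ when $\phi_0 \le \pi/2$. The code evaluates $\sup_{y \ne x} \|P_{T_x^\perp M}\,y\|^2 / (1 - \langle x,y\rangle)^2$ on a grid, where the projection removes the components of $y$ along $x$ and along the unit tangent to the circle at $x$.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def point(t, phi0):
    """Point on the circle of latitude at polar angle phi0 and longitude t."""
    s, c = math.sin(phi0), math.cos(phi0)
    return (s * math.cos(t), s * math.sin(t), c)


def unit_tangent(t):
    """Unit tangent to the circle of latitude at longitude t."""
    return (-math.sin(t), math.cos(t), 0.0)


def cot2_local_reach(phi0, t0=0.0, n=360):
    """Grid approximation of sup_y |P y|^2 / (1 - <x, y>)^2 at x = point(t0)."""
    x, tau = point(t0, phi0), unit_tangent(t0)
    best = 0.0
    for j in range(1, n):  # skip j = 0, which is y = x
        y = point(t0 + 2 * math.pi * j / n, phi0)
        yx, yt = dot(y, x), dot(y, tau)
        # Component of y normal to M inside the tangent space of the sphere at x.
        normal_part = [yi - yx * xi - yt * ti for yi, xi, ti in zip(y, x, tau)]
        best = max(best, dot(normal_part, normal_part) / (1 - yx) ** 2)
    return best


phi0 = 1.0
print(cot2_local_reach(phi0), 1 / math.tan(phi0) ** 2)
```

For a circle of latitude the ratio is in fact the same for every $y$, so the grid supremum matches $\cot^2\phi_0$ up to rounding.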


Appendix 2 

We need to show that the remainder terms, $O(1/k^2)$, in (9.37) and (9.38) are of the
right order and, just as importantly, are uniform over $M \times M$.

As mentioned earlier, if the correlation estimates $C_k(x,y)$ of $C(x,y)$ were centered
at sample means rather than zero (which we shall refer to as the ‘standard’ case) we
could simply quote known results from the Statistics literature to establish everything
we need. These results are not hard to prove, but they involve pages of tedious algebra,
which we do not want to try to reproduce here. Rather, we shall content ourselves with
describing the standard proofs, and where changes need to be made to cover our situation.

The standard case is treated in [12]. Following the derivation in Chapter 16, Section
16.24 there, we start by writing out the joint probability of $k$ sample values
$\{(f_j(x), f_j(y))\}_{j=1}^k$ drawn from a bivariate normal density with zero means and unit
variances in terms of the statistics we are interested in, namely,

$$s_1^2 = \frac{1}{k}\sum_{j=1}^k f_j^2(x), \qquad s_2^2 = \frac{1}{k}\sum_{j=1}^k f_j^2(y),$$

along with the $C_k(x,y)$ of (4.23). These replace the standard sample mean centered
version of these statistics in [12].

Using the result of Example 11.6 in Chapter 11 of [12], which deals with finding
the distribution of a sum of squares of i.i.d. standard normal variates, and following
the discussion in Section 16.24 there, we find that the exact joint probability density
of $s_1$, $s_2$, and $C_k(x,y)$, on $\mathbb{R}_+ \times \mathbb{R}_+ \times [-1,1]$, is given by

$$\frac{k^k s_1^{k-1} s_2^{k-1} (1 - C_k^2(x,y))^{(k-3)/2}}{\pi\Gamma(k-1)(1 - C^2(x,y))^{k/2}} \exp\Big\{-\frac{k}{2(1 - C^2(x,y))}\big(s_1^2 - 2C(x,y)C_k(x,y)s_1 s_2 + s_2^2\big)\Big\}.$$

As in Section 16.32 of [12], we now integrate out $s_1$ and $s_2$, and use the remaining
density of $C_k$ to compute that

$$E\{C_k(x,y)\} = \frac{C(x,y)\,\Gamma^2((k+1)/2)}{\Gamma(k/2)\,\Gamma((k+2)/2)}\, F\Big(\frac{1}{2}, \frac{1}{2}, \frac{k+2}{2}, C^2(x,y)\Big), \qquad (12.88)$$

where $F$ is the hypergeometric function. Note that $0 \le C^2(x,y) \le 1$ for all $(x,y) \in
M \times M$. The fact that

$$F(\alpha, \beta, \gamma, x) = 1 + x\,O(1/\gamma) \quad \text{as } \gamma \to \infty$$

uniformly for $x$ in any bounded set (cf. [18]), and a Stirling’s approximation which
gives that the ratio of Gamma functions in (12.88) converges to 1 as $k \to \infty$, gives the
uniformity of the error bound in (9.37). A similar calculation establishes (9.38) and
the uniformity of the error bound there.
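The behaviour of the exact mean can be explored numerically. The sketch below assumes the mean has the form $E\{C_k\} = C\,\Gamma^2((k+1)/2)\,/\,(\Gamma(k/2)\Gamma((k+2)/2)) \cdot F(\tfrac12, \tfrac12, \tfrac{k+2}{2}, C^2)$. It evaluates the Gamma ratio through `math.lgamma` and the hypergeometric function through its power series, then compares the result with the first-order approximation $C(1 - (1 - C^2)/(2k))$. That approximation is derived here from Stirling-type expansions rather than quoted from the text, and the function names are illustrative.

```python
import math


def hyp2f1(a, b, c, x, tol=1e-16, max_terms=10_000):
    """Gauss hypergeometric series sum_n (a)_n (b)_n / ((c)_n n!) x^n, for |x| < 1."""
    term, total = 1.0, 1.0
    for n in range(max_terms):
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * x
        total += term
        if abs(term) < tol * abs(total):
            break
    return total


def mean_sample_correlation(k, c):
    """Exact mean of the uncentered sample correlation, under the assumed formula."""
    log_ratio = (2 * math.lgamma((k + 1) / 2)
                 - math.lgamma(k / 2) - math.lgamma((k + 2) / 2))
    return c * math.exp(log_ratio) * hyp2f1(0.5, 0.5, (k + 2) / 2, c * c)


c = 0.5
for k in (10, 100, 1000):
    exact = mean_sample_correlation(k, c)
    first_order = c * (1 - (1 - c * c) / (2 * k))
    print(k, exact, first_order)
```

The gap between the two columns shrinks roughly like $1/k^2$, in line with the order of the remainder terms discussed above.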